PHPFixing

Sunday, November 20, 2022

[FIXED] How to replace spaces inside of HTML tags while keeping the tags in PHP (preg_replace)?

Tags: html, php, preg-replace, regex

Issue

Let's say I have this string:

$text = '<p > ¡Esto es una prueba! < /p > <p> <strong > Prueba 123 </strong> </p> <p> <strong> < a href="https://matricom.net"> MATRICOM < / a> </ strong> </p> <p> <strong > Todas las pruebas aquí ... </strong > < /p>';

I want to fix the HTML tags using PHP (they are malformed because of the extra spaces). I have tried several regular expressions that I found online, such as this:

$html = trim(preg_replace('/<\s+>/', '<>', $text));

and:

$html = preg_replace('/<(.+?)(?:»| |″)(.+?)>/', '<\1\2>', $text);

I am trying to get output like this (spaces removed at the start and end of each HTML tag):

'<p> ¡Esto es una prueba! </p> <p> <strong> Prueba 123 </strong> </p> <p> <strong> <a href="https://matricom.net"> MATRICOM </a> </strong> </p> <p> <strong> Todas las pruebas aquí ... </strong> </p>'

Backstory: Google Translate tends to add random spaces to translation results, which breaks the HTML structure. I am just looking for a quick way to clean the tags up. I have spent two days searching and can't find anything that quite fits what I need.


Solution

In the most general case, you may use a preg_replace_callback solution:

$text='<p > ¡Esto es una prueba! < /p > <p> <strong > Prueba 123 </strong> </p> <p> <strong> <a href="https://matricom.net"> MATRICOM < / a> </ strong> </p> <p> <strong > Todas las pruebas aquí ... </strong > < /p>';
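// Match every "<...>" run and strip the spaces inside it.
// Caveat: this removes ALL spaces in the match, so <a href="..."> becomes <ahref="...">.
echo preg_replace_callback('~<[^<>]+>~u', function ($m) {
    return str_replace(' ', '', $m[0]);
    // or, to remove any whitespace (tabs, newlines too): preg_replace('~\s+~u', '', $m[0]);
}, $text);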

See the PHP demo.
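Because the callback above removes every space inside a match, it also deletes the space that separates a tag name from its attributes. The following is a minimal sketch of one possible variant, not the answerer's code: it trims whitespace only after `<`, around a leading `/`, and before `>`. The function name `tidy_tag_edges` and the shortened sample string are hypothetical, introduced only for illustration.

```php
<?php
// Hypothetical helper (an assumption, not part of the original answer):
// tidy only the edges of each tag and keep the spaces between attributes.
function tidy_tag_edges(string $html): string
{
    // Group 1: the optional "/" of a closing tag.
    // Group 2: the tag name plus any attributes (matched lazily so that
    //          trailing whitespace is left for the final \s*).
    $result = preg_replace('~<\s*(/?)\s*([^<>]*?)\s*>~u', '<$1$2>', $html);

    // preg_replace() returns null on failure (e.g. invalid UTF-8 with the
    // "u" modifier); fall back to the unchanged input in that case.
    return $result ?? $html;
}

$sample = '<p > <strong > < a href="https://matricom.net"> MATRICOM < / a> </ strong> < /p>';
echo tidy_tag_edges($sample), PHP_EOL;
```

Like the original pattern, this sketch treats any `<...>` run as a tag, so a literal `<` and `>` in plain text (for example in "a < b and c > d") would be rewritten as well.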

However, you might want a pattern that matches only the tags that actually appear in the Google Translate output. For a, p and strong tags without attributes, it will look like this:

'~<\s*(?:/\s*)?(?:p|a|strong)\s*>~u'

See this regex demo

Details

  • < - a < char
  • \s* - zero or more whitespace characters
  • (?:/\s*)? - an optional sequence of / followed by zero or more whitespace characters
  • (?:p|a|strong) - one of the tag names p, a or strong
  • \s* - zero or more whitespace characters
  • > - a > char.
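The pattern broken down above can be plugged into the same preg_replace_callback approach. This is a rough sketch under the assumption that only attribute-less p, a and strong tags need fixing; the wrapper name `fix_known_tags` and the sample string are hypothetical.

```php
<?php
// Hypothetical wrapper (an assumption) around the tag-specific pattern.
function fix_known_tags(string $html): string
{
    $pattern = '~<\s*(?:/\s*)?(?:p|a|strong)\s*>~u';

    $result = preg_replace_callback($pattern, function (array $m) {
        // The match contains only "<", "/", a tag name, ">" and whitespace,
        // so removing all whitespace cannot damage any attributes.
        return preg_replace('~\s+~u', '', $m[0]);
    }, $html);

    // preg_replace_callback() returns null on failure; keep the input then.
    return $result ?? $html;
}

echo fix_known_tags('< /p > <strong > x </ strong> < a href="#"> y < / a>'), PHP_EOL;
```

Since the pattern allows nothing but whitespace between the tag name and `>`, an opening tag that carries attributes, such as `< a href="#">` in the sample, is not matched and is left untouched.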


Answered By - Wiktor Stribiżew
Answer Checked By - Dawn Plyler (PHPFixing Volunteer)