Sunday, November 20, 2022

[FIXED] How to convert input string to desired(given) format using regex or arrays?

November 20, 2022 php, preg-match, preg-replace, regex No comments

Issue

I have an input string coming from the user like this

" |appl | pinea | orang frui | vege lates"

and I want to convert it to this format

"+(appl* | pinea* | orang*) +(frui* | vege*) +(lates*)"

I have tried preg_match and arrays but have been unable to find a solution

$term = " |appl | pinea | orang frui | vege lates";

$term = trim($term, "| ");

$term = preg_replace("#[\| ]{2,}#", " | ", $term);

// $term value is "appl | pinea | orang frui | vege lates" after applying safe guards

// $matches = [];
// $matched = preg_match_all("#(\P{Xan}+)\|(\P{Xan}+)#ui", $term, $matches);
// var_dump($matches);

$termArray = explode(" ", $term);

foreach($termArray as $index => $singleWord) {
   $termArray[$index] = trim($singleWord);
}

$termArray = array_filter($termArray);

foreach($termArray as $index => $singleWord) {
   if($singleWord === "|") {

       $pipeIndexes[] = $index;
       $orIndexes[] = $index - 1;
       $orIndexes[] = $index + 1;

       unset($termArray[$index]);

    }

}

// desired string I am hoping for
// "+(appl* | pinea* | orang*) +(frui* | vege*) +(lates*)"

Solution

This assumes you have cleaned up your input string to the following:

appl | pinea | orang frui | vege lates

Quick working example

https://3v4l.org/NjIai

<?php

$subject = 'appl | pinea | orang frui | vege lates';

// Add the "*" in place at end
$subject = preg_replace('/[a-z]+/i', '$0*', $subject);

// Group the terms adding brackets and + sign
$result = preg_replace('/(?:[^\s|]+\s*\|\s*)*(?:[^\s|]+)/i', '+($0)', $subject);

var_dump($result);
// Outputs:
// string(53) "+(appl* | pinea* | orang*) +(frui* | vege*) +(lates*)"

Add in the `*` after each word:

$subject = preg_replace('/[a-z]+/i', '$0*', $subject);

Group based on space

$result = preg_replace('/(?:[^\s|]+\s*\|\s*)*(?:[^\s|]+)/i', '+($0)', $subject);

Visualisation

Human Readable

(?:[^\s|]+\s*\|\s*)*(?:[^\s|]+)

Options: Case insensitive; Exact spacing; Dot doesn’t match line breaks; ^$ don’t match at line breaks; Greedy quantifiers; Regex syntax only

Match the regular expression below (?:[^\s|]+\s*\|\s*)*
- Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
- Match any single character NOT present in the list below [^\s|]+
  - Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
  - A “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) \s
  - The literal character “|” |
- Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) \s*
  - Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
- Match the character “|” literally \|
- Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) \s*
  - Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
Match the regular expression below (?:[^\s|]+)
- Match any single character NOT present in the list below [^\s|]+
  - Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
  - A “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) \s
  - The literal character “|” |

+($0)

Insert the character “+” literally +
Insert an opening parenthesis (
Insert the whole regular expression match $0
Insert a closing parenthesis )
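
As a rough sketch, the cleanup from the question and the two replacements from this answer could be combined into one function. The name toBooleanQuery is hypothetical and not part of the original answer; the patterns are the ones shown above.

```php
<?php

// Hypothetical helper (the name is not from the answer): normalise the raw
// input, then apply the two replacements from the answer.
function toBooleanQuery(string $term): string
{
    // Cleanup taken from the question: drop leading/trailing pipes and spaces,
    // then collapse any run of two or more pipes/spaces into " | ".
    $term = trim($term, '| ');
    $term = preg_replace('#[| ]{2,}#', ' | ', $term);

    // Step 1: append "*" to every word.
    $term = preg_replace('/[a-z]+/i', '$0*', $term);

    // Step 2: wrap each pipe-separated group in "+( ... )".
    return preg_replace('/(?:[^\s|]+\s*\|\s*)*(?:[^\s|]+)/', '+($0)', $term);
}

echo toBooleanQuery(' |appl | pinea | orang frui | vege lates'), "\n";
// +(appl* | pinea* | orang*) +(frui* | vege*) +(lates*)
```

Note that the cleanup step inherited from the question turns two or more consecutive spaces into a pipe, so this sketch assumes groups are separated by exactly one space.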

Answered By - Dean Taylor

Answer Checked By - Dawn Plyler (PHPFixing Volunteer)

[FIXED] how to remove multiple slashes in URI with 'PREG' or 'HTACCESS'

November 20, 2022 .htaccess, php, preg-match, preg-replace, preg-split No comments

Issue

how to remove multiple slashes in URI with 'PREG' or 'HTACCESS'

site.com/edition/new/// -> site.com/edition/new/

site.com/edition///new/ -> site.com/edition/new/

thanks

Solution

In a regex, the plus symbol + matches one or more occurrences of the preceding character. So preg_replace can collapse every run of one or more / into a single slash:

$url = "site.com/edition/new///";

$newUrl = preg_replace('/(\/+)/', '/', $url);

// every run of slashes is now a single forward slash
echo $newUrl;
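
One caveat: a pattern that collapses every run of slashes would also turn the "//" of "http://" into a single slash if the URL carries a scheme. A possible variant, sketched here with a hypothetical helper name, uses a negative lookbehind to skip slashes that directly follow a colon:

```php
<?php

// Hypothetical helper: collapse repeated slashes, but leave a "://" scheme
// separator untouched thanks to the (?<!:) negative lookbehind.
function collapseSlashes(string $url): string
{
    return preg_replace('#(?<!:)/{2,}#', '/', $url);
}

echo collapseSlashes('site.com/edition///new/'), "\n";        // site.com/edition/new/
echo collapseSlashes('http://site.com/edition/new///'), "\n"; // http://site.com/edition/new/
```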

Answered By - Ibu

Answer Checked By - Candace Johnson (PHPFixing Volunteer)

[FIXED] how to do a preg_replace in twig

November 20, 2022 preg-match, preg-replace, symfony, twig No comments

Issue

I currently am trying to make a variable using the current url for the view. To get the url, I am using {% set uri = app.request.uri %} (uri being my variable which is the current url for that particular view). The thing is, I am only interested in what makes the url unique (the end of it - a unique object from an array - happens to be a uri), and not the beginning (path to my application). I was thinking I could use a preg_replace to do so, but TWIG doesn't have this function. Just wondering if someone would know how to accomplish what I am trying to do?

I'm new to Symfony (and fairly new to PHP), so my explanations may not be clear (sorry).

Ex.

{% set uri = app.request.uri %}

output: http://website.com/http://item.org/1

I want to modify the uri variable to ONLY have http://item.org/1 (and not the path to my website).

I'm thinking creating a Twig Extension with the preg_replace will allow me to do this ..but not sure if it's the best way to go (inexperienced).

Overall goal: The unique value for "uri" in the view is appended to the websites path by another view from an array of objects ($results) with attributes, one being "uri". My ultimate goal is to only display all associated attributes (or row) for an object in my $results array. I was thinking I could do this by first creating a key (my uri variable) in a foreach, and returning the row in the array which matches this key. This is why I am trying to create a variable with the url so that I can use it as a key for my foreach loop to iterate over $results. I am NOT using a database or Doctrine.

Thank you ahead of time for the help!

Solution

The best way is to move the logic from template to the controller.

If you need preg_replace in twig you must create custom extension.
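
A minimal sketch of the controller-side approach, in plain PHP: strip the application's base URL before handing the value to the template. The function name and the base URL argument are assumptions made for illustration; how the value reaches the view depends on your controller.

```php
<?php

// Hypothetical controller-side helper: strip the application's base URL
// from the full request URI so only the trailing item URI remains.
function extractItemUri(string $requestUri, string $baseUrl): string
{
    // preg_quote() makes the base URL safe to embed in the pattern.
    $pattern = '#^' . preg_quote(rtrim($baseUrl, '/'), '#') . '/#';
    return preg_replace($pattern, '', $requestUri);
}

// The result would then be passed to the template as an ordinary variable.
echo extractItemUri('http://website.com/http://item.org/1', 'http://website.com'), "\n";
// http://item.org/1
```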

Answered By - Alexey B.

Answer Checked By - Robin (PHPFixing Admin)

[FIXED] How can I preg_match script tag src, but avoid effecting img tag src?

November 20, 2022 php, preg-match, preg-replace No comments

Issue

I have to match local src's and make them load via the web. Example:

src="/js/my.js">

Becomes:

src="http://cdn.example.com/js/my.js">

This is what I have now:

if (!preg_match("#<script(.+?) src=\"http#i",$page)){ 
$page = preg_replace("#<script(.+?) src=\"#is", "<script$1 src=\"$workingUrl", $page); 
}

It works fine when it encounters something like this:

<script type='text/javascript' src='/wp-includes/js/jquery/jquery.js?ver=1.8.3'></script>

It fails when it encounters something like this:

<script language="JavaScript">
window.moveTo(0,0);
window.resizeTo(screen.width,screen.height);
</script>

If the script tag doesn't contain a src it will then find the src of the first image tag and switch out its URL.

I need to know how to get it to terminate the match on the script tag only and/or how to perform the replacement better.

Solution

Definitely use a DOM parser. Xpath with DOMDocument will cleanly, reliably replace the script tags that:

Have a src attribute and
The src attribute does not start with http.

I could have further developed the xpath query expression to check for the leading http substring, but I didn't want to scare you off with more syntax.

Code: (Demo)

$html = <<<HTML
<html>
<head>
<script type='text/javascript' src='/wp-includes/js/jquery/jquery.js?ver=1.8.3'></script>
<script language="JavaScript">
window.moveTo(0,0);
window.resizeTo(screen.width,screen.height);
</script>
</head>
</html>
HTML;

$workingUrl = 'https://www.example.com';

$dom = new DOMDocument; 
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
foreach ($xpath->query("//script[@src]") as $node) {
    $src = $node->getAttribute('src');
    if (strpos($src, 'http') !== 0) {
        // Prepend the base URL rather than overwriting the local path
        $node->setAttribute('src', $workingUrl . $src);
    }
}
echo $dom->saveHTML();

Output:

<html>
<head>
<script type="text/javascript" src="https://www.example.com/wp-includes/js/jquery/jquery.js?ver=1.8.3"></script>
<script language="JavaScript">
window.moveTo(0,0);
window.resizeTo(screen.width,screen.height);
</script>
</head>
</html>

The only slightly "scarier" xpath version: (Demo)

$dom = new DOMDocument; 
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
foreach ($xpath->query("//script[@src and not(starts-with(@src,'http'))]") as $node) {
    $node->setAttribute('src', $workingUrl . $node->getAttribute('src'));
}
echo $dom->saveHTML();

Answered By - mickmackusa

Answer Checked By - Timothy Miller (PHPFixing Admin)

[FIXED] How to convert a number of spaces(if more than one) to single dash using PHP

November 20, 2022 php, preg-match, preg-replace, replace, string No comments

Issue

I can convert Space to dash.

$string = str_replace(' ', '-', $string);

But I want to convert any number of consecutive spaces (two, three, four, five, six and so on) into a single dash.

Solution

Try this regex using preg_replace

$string = preg_replace( '!\s+!', '-', $string );
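
A short usage sketch, with a hypothetical wrapper name. It trims the input first, which is an addition to the answer, so that leading or trailing spaces do not turn into stray dashes:

```php
<?php

// Hypothetical helper: turn every run of whitespace into one dash.
function spacesToDash(string $string): string
{
    // trim() first so leading/trailing spaces do not become stray dashes.
    return preg_replace('/\s+/', '-', trim($string));
}

echo spacesToDash('convert   a  number     of spaces'), "\n";
// convert-a-number-of-spaces
```

Keep in mind that \s also matches tabs and line breaks; if only literal spaces should be converted, a pattern such as '/ +/' would be the narrower choice.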

Answered By - Amit

Answer Checked By - Willingham (PHPFixing Volunteer)

[FIXED] How to isolate content until the first double newline sequence?

November 20, 2022 php, preg-match, preg-replace, regex, substring No comments

Issue

<ul>
    <li><a href="#">Foo</a></li>
    <li><a href="#">Foo</a></li>
    <li><a href="#">Foo</a></li>
</ul>

<ul>
    <li><a href="#">Bar</a></li>
    <li><a href="#">Bar</a></li>
    <li><a href="#">Bar</a></li>
</ul>

How can I get any content until the first blank line?

NOTE: First and second part of the content doesn't always start with ul.

Solution

preg_match('/\A.*?(?=\s*^\s*$)/smx', $subject, $regs);
$result = $regs[0];

Explanation

preg_match(
    '/\A    # Start of string
    .*?     # Match any number of characters (as few as possible)
    (?=     # until it is possible to match...
     \s*    #  trailing whitespace, including a linebreak 
     ^      #  Start of line
     \s*    #  optional whitespace
     $      #  End of line
    )       # (End of lookahead assertion)/smx', 
    $subject, $regs);
$result = $regs[0];

This assumes that you count lines containing nothing but whitespace as blank lines. If not, remove the "optional whitespace" line.
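
An alternative sketch, not taken from the answer, splits the input at the first blank line with preg_split. The helper name is hypothetical. Unlike the lookahead version, it keeps any trailing spaces on the last line of the first block, and it returns the whole input when there is no blank line at all.

```php
<?php

// Hypothetical helper: return everything before the first blank line.
// A line holding only spaces or tabs also counts as blank.
function firstBlock(string $subject): string
{
    // \R matches any line break (\n, \r\n, \r); [^\S\r\n]* matches
    // horizontal whitespace only. Limit 2 stops after the first split.
    $parts = preg_split('/\R[^\S\r\n]*\R/', $subject, 2);
    return $parts[0];
}

$html = "<ul>\n    <li>Foo</li>\n</ul>\n\n<ul>\n    <li>Bar</li>\n</ul>";
echo firstBlock($html), "\n";
```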

Answered By - Tim Pietzcker

Answer Checked By - Cary Denson (PHPFixing Admin)

[FIXED] When do I need u-modifier in PHP regex?

November 20, 2022 pcre, php, preg-match, preg-replace, utf-8 No comments

Issue

I know that PHP PCRE functions treat strings as byte sequences, so many sites suggest using the /u modifier to handle input and regex as UTF-8.

But do I really always need this? My tests show that this flag makes no difference when I don't use escape sequences, the dot, or something similar.

For example

preg_match('/^[\da-f]{40}$/', $string); to check if string has format of a SHA1 hash

preg_replace('/[^a-zA-Z0-9]/', $spacer, $string); to replace every char that is non-ASCII letter or number

preg_replace('/^\+\((.*)\)$/', '\1', $string); for getting inner content of +(XYZ)

These regex contain only single byte ASCII symbols, so it should work on every input, regardless of encoding, shouldn't it? Note that third regex uses dot operator, but as I cut off some ASCII chars at beginning and end of string, this should work on UTF-8 also, correct?

Can anyone tell me if I'm overlooking something?

Solution

There is no problem with the first expression. The characters being quantified are explicitly single-byte, and cannot occur in a UTF-8 multibyte sequence.

The second expression may give you more spacers than you expect; for example:

echo preg_replace('/[^a-zA-Z0-9]/', "0", "💩");
// => 0000

The third expression also does not pose a problem, as the repeated character is limited by parentheses (which is ASCII-safe).

This is more dangerous:

echo preg_replace('/^(.)/', "0", "💩");
// => 0???

Typically, without knowing more about how UTF-8 works, it may be tricky to predict which regexps are safe, and which are not, so using /u for all text that might contain a character above U+007F is the best practice.
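
To make the difference concrete, here is a small sketch that runs the same replacements with and without the u modifier. The variable names are only for illustration.

```php
<?php

$pile = "\u{1F4A9}"; // U+1F4A9, encoded as four bytes in UTF-8

// Without /u each of the four bytes is replaced separately.
$bytes = preg_replace('/[^a-zA-Z0-9]/', '0', $pile);

// With /u the whole code point is treated as one character.
$chars = preg_replace('/[^a-zA-Z0-9]/u', '0', $pile);
$first = preg_replace('/^(.)/u', '0', $pile . 'abc');

var_dump($bytes, $chars, $first);
// string(4) "0000"
// string(1) "0"
// string(4) "0abc"
```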

Answered By - Amadan

Answer Checked By - Candace Johnson (PHPFixing Volunteer)

[FIXED] How to write preg_match for a date followed by specific string?

November 20, 2022 php, preg-match, preg-replace, regex No comments

Issue

I want to extract dates from several HTML documents. The date always follows this pattern:

The first three letters of the month, with the first character in uppercase, e.g. Jan.
A two-digit day of the month, e.g. 09.
A comma as a separator.
A four-digit year, e.g. 2022.

A sample of a complete date is Jan 09, 2022

I want to extract only those dates which are wrapped in span tags. So, the complete pattern is

<span>Jan 09, 2022</span>

I am not good at writing preg_match. Can anyone please help me?

Solution

<span>(\w{3} \d{1,2}, \w{4})<\/span>

\w is a meta-character for the set [a-zA-Z0-9_].

{3} means thrice.

\d is a meta-character for the set [0-9].

{1,2} means once or twice.

Try it https://regex101.com/r/tNRa73/1

$pattern = '/<span>(\w{3} \d{1,2}, \w{4})<\/span>/'; 

preg_match(
  $pattern,
  $html,
  $matches // <-- The results will be added to this new variable.
);

$matches[1]; // The date will be in the first index because it was
             // the first "capture group" i.e set of parens.


// If you expect multiple dates in one HTML document, then use:
preg_match_all(
  $pattern,
  $html,
  $matches
);

$matches[1]; // Now, the first index is an array of matches of
             // the first "capture group".
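
The pattern above is fairly permissive: \w{3} and \w{4} also accept digits and underscores. If the dates always follow the format described in the question, a stricter variant could look like the sketch below. This is an assumption about the input and not part of the original answer; the sample HTML is made up.

```php
<?php

// Stricter variant (an assumption based on the format described in the
// question): capitalised three-letter month, two-digit day, four-digit year.
$pattern = '/<span>([A-Z][a-z]{2} \d{2}, \d{4})<\/span>/';

$html = '<p>Feb 01, 2020</p>'
      . '<span>Jan 09, 2022</span>'
      . '<span>abc 12, wxyz</span>'
      . '<span>Mar 15, 2021</span>';

preg_match_all($pattern, $html, $matches);

// Only the two well-formed dates inside span tags are captured.
print_r($matches[1]);
```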

Answered By - rhinosforhire

Answer Checked By - Terry (PHPFixing Volunteer)

[FIXED] How to disallow repeated numbers in telephone field like 1111111 or 222222

November 06, 2022 contacts, phone-number, preg-match No comments

Issue

I am using contact form 7 and want to disallow users to enter repeated numbers like 1111111 or 2222222 in phone field.

I am using below code to enter only 10 digits. Can anyone help me with what I should change or add in this to work.

// define the wpcf7_is_tel callback
function custom_filter_wpcf7_is_tel( $result, $tel ) {
  $result = preg_match( '/^\(?\+?([0-9]{0})?\)?[-\. ]?(\d{10})$/', $tel );
  return $result;
}

add_filter( 'wpcf7_is_tel', 'custom_filter_wpcf7_is_tel', 10, 2 );

Solution

First of all, [0-9]{0} looks like a typo as this pattern matches nothing, an empty string. You probably wanted to match an optional area code, three digits. So, use \d{3} if you meant that.

Next, to disallow same digits within those you match with \d{10}, you simply need to re-write it as (?!(\d)\1{9})\d{10}.

Bearing in mind what was said above, the solution is

function custom_filter_wpcf7_is_tel( $result, $tel ) {
  return preg_match('/^\(?\+?(?:\d{3})?\)?[-. ]?(?!(\d)\1{9})\d{10}$/', $tel );
}

See the regex demo.
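
The pattern can be tried outside WordPress with a small standalone sketch. The function name isValidTel is hypothetical and the filter registration is left out.

```php
<?php

// Standalone version of the check, without the WordPress filter wiring.
// isValidTel() is a hypothetical name used only for this sketch.
function isValidTel(string $tel): bool
{
    $pattern = '/^\(?\+?(?:\d{3})?\)?[-. ]?(?!(\d)\1{9})\d{10}$/';
    return preg_match($pattern, $tel) === 1;
}

var_dump(isValidTel('9876543210'));       // bool(true)
var_dump(isValidTel('(123) 9876543210')); // bool(true)
var_dump(isValidTel('1111111111'));       // bool(false): one digit repeated ten times
var_dump(isValidTel('12345'));            // bool(false): too short
```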

Answered By - Wiktor Stribiżew

Answer Checked By - Marie Seifert (PHPFixing Admin)

[FIXED] Why is preg_match() with pattern "75.1." and subject "75.142.3.8" returning a match in PHP?

February 06, 2022 php, preg-match, regex No comments

Issue

I am testing the matching of a function I am using to check IP ranges.

$a = "75.1.";
$b = "75.142.3.8";
$num = preg_match("/^$a/", $b, $matches);
echo "$a<br/>\n$b<br/>\n$num";

This is returning a $num of 1 for me. Why is this returning a match, despite "1." not matching "142"? Am I making some error with regular expressions? strstr($b, $a) and str_starts_with($b, $a) are both returning FALSE, as I was expecting.

Solution

The . is a wildcard that matches any character; to match a literal dot you need to escape it with a \

$a = "75\.1\.";
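
When the prefix comes from a variable, escaping by hand is easy to forget. A sketch using PHP's preg_quote, wrapped in a hypothetical helper, could look like this:

```php
<?php

// Hypothetical helper: test whether $ip starts with the literal $prefix.
function ipStartsWith(string $prefix, string $ip): bool
{
    // preg_quote() escapes regex metacharacters such as "." in the prefix;
    // the second argument also escapes the "/" delimiter.
    return preg_match('/^' . preg_quote($prefix, '/') . '/', $ip) === 1;
}

var_dump(preg_match('/^75.1./', '75.142.3.8')); // int(1): each "." matches any character
var_dump(ipStartsWith('75.1.', '75.142.3.8'));  // bool(false)
var_dump(ipStartsWith('75.1.', '75.1.3.8'));    // bool(true)
```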

Answered By - Optimum

[FIXED] PHP: Check for characters in the Latin script plus spaces and numbers

January 22, 2022 php, preg-match, regex No comments

Issue

I am new to regex and I have been going round and round on this problem.

PHP: Check alphabetic characters from any latin-based language? gives the brilliant regex to check for any characters in the Latin script, which is part of what I need.

^\p{Latin}+$

and provides a working example at https://regex101.com/r/I5b2mC/1

If I use the regex in PHP by using

echo preg_match('/^\p{Latin}+$/', $testString);

and $testString contains only Latin letters, the output will be 1. If there is any non-Latin letters, the output will be 0. Brilliant.

To add numbers in I tried ^\p{Latin}+[[:alnum:]]*$ but that allows any characters in the Latin script OR non-Latin letters and numbers (letters without accents — grave, acute, cedilla, umlaut etc.) as it is the equivalent to [a-zA-Z0-9].

If you add any numbers with characters in the Latin script, echo preg_match('/^\p{Latin}+[[:alnum:]]*$/', $testString); returns a 0. All numbers return a 0 too. This can be confirmed by editing the expression in https://regex101.com/r/I5b2mC/1

How do I edit the expression in echo preg_match('/^\p{Latin}+$/', $testString); to output a 1 if there are any characters in the Latin script, any numbers and/or spaces in $testString? For example, I wish for a 1 to be output if $testString is Café ßüs 459.

Solution

There are at least two things to change:

Add the u flag to support chars other than ASCII (/^\p{Latin}+$/ => /^\p{Latin}+$/u)
Create a character class to hold the letter, digit and whitespace patterns (/^\p{Latin}+$/u => /^[\p{Latin}]+$/u)
Then add the digit and whitespace patterns to that class. If you need to support any Unicode digits, add \d. If you need to support only ASCII digits, add 0-9.

Thus, you can use

preg_match('/^[\p{Latin}\s0-9]+$/u', $testString) // ASCII only digits
preg_match('/^[\p{Latin}\s\d]+$/u', $testString)  // Any digits

Also, \s with u flag will match any Unicode whitespace chars.
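
A quick check of the ASCII-digit variant against the sample string from the question, wrapped in a hypothetical helper for illustration:

```php
<?php

// Hypothetical helper wrapping the ASCII-digit variant of the pattern.
function isLatinText(string $testString): bool
{
    return preg_match('/^[\p{Latin}\s0-9]+$/u', $testString) === 1;
}

var_dump(isLatinText('Café ßüs 459')); // bool(true)
var_dump(isLatinText('Привет 459'));   // bool(false): Cyrillic letters
var_dump(isLatinText('Café!'));        // bool(false): "!" is not in the class
```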

Answered By - Wiktor Stribiżew
