PHPFixing
Showing posts with label preg-match. Show all posts

Sunday, November 20, 2022

[FIXED] How to convert input string to desired(given) format using regex or arrays?

 November 20, 2022     php, preg-match, preg-replace, regex     No comments   

Issue

I have an input string coming from the user like this:

" |appl | pinea | orang frui | vege lates"

and I want to convert it to this format

"+(appl* | pinea* | orang*) +(frui* | vege*) +(lates*)"

I have tried preg_match and arrays, but have been unable to find a solution:

$term = " |appl | pinea | orang frui | vege lates";

$term = trim($term, "| ");

$term = preg_replace("#[\| ]{2,}#", " | ", $term);

// $term value is "appl | pinea | orang frui | vege lates" after applying safe guards

// $matches = [];
// $matched = preg_match_all("#(\P{Xan}+)\|(\P{Xan}+)#ui", $term, $matches);
// var_dump($matches);

$termArray = explode(" ", $term);

foreach($termArray as $index => $singleWord) {
   $termArray[$index] = trim($singleWord);
}

$termArray = array_filter($termArray);

foreach($termArray as $index => $singleWord) {
   if($singleWord === "|") {

       $pipeIndexes[] = $index;
       $orIndexes[] = $index - 1;
       $orIndexes[] = $index + 1;

       unset($termArray[$index]);

    }

}

// desired string I am hoping for
// "+(appl* | pinea* | orang*) +(frui* | vege*) +(lates*)"



Solution

This assumes you have cleaned up your input string to the following:

appl | pinea | orang frui | vege lates

Quick working example

https://3v4l.org/NjIai

<?php

$subject = 'appl | pinea | orang frui | vege lates';

// Add the "*" in place at end
$subject = preg_replace('/[a-z]+/i', '$0*', $subject);

// Group the terms adding brackets and + sign
$result = preg_replace('/(?:[^\s|]+\s*\|\s*)*(?:[^\s|]+)/i', '+($0)', $subject);

var_dump($result);
// Outputs:
// string(53) "+(appl* | pinea* | orang*) +(frui* | vege*) +(lates*)"

Add the * after each word:

$subject = preg_replace('/[a-z]+/i', '$0*', $subject);

Group the terms that are separated by spaces, wrapping each group in +( ):

$result = preg_replace('/(?:[^\s|]+\s*\|\s*)*(?:[^\s|]+)/i', '+($0)', $subject);


Human Readable

(?:[^\s|]+\s*\|\s*)*(?:[^\s|]+)

Options: Case insensitive; Exact spacing; Dot doesn’t match line breaks; ^$ don’t match at line breaks; Greedy quantifiers; Regex syntax only

  • Match the regular expression below (?:[^\s|]+\s*\|\s*)*
    • Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
    • Match any single character NOT present in the list below [^\s|]+
      • Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
      • A “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) \s
      • The literal character “|” |
    • Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) \s*
      • Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
    • Match the character “|” literally \|
    • Match a single character that is a “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) \s*
      • Between zero and unlimited times, as many times as possible, giving back as needed (greedy) *
  • Match the regular expression below (?:[^\s|]+)
    • Match any single character NOT present in the list below [^\s|]+
      • Between one and unlimited times, as many times as possible, giving back as needed (greedy) +
      • A “whitespace character” (any Unicode separator, tab, line feed, carriage return, vertical tab, form feed, next line) \s
      • The literal character “|” |

+($0)

  • Insert the character “+” literally +
  • Insert an opening parenthesis (
  • Insert the whole regular expression match $0
  • Insert a closing parenthesis )
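
The two replacements above can be combined with a clean-up step into one function. This is only a sketch: the name toBooleanQuery() and the clean-up patterns are assumptions and are not part of the original answer, which expects an already cleaned string.

```php
<?php
// Hypothetical helper, not part of the original answer.
function toBooleanQuery(string $term): string
{
    // Assumed clean-up: collapse whitespace runs, normalise each pipe
    // (plus any neighbouring pipes/spaces) to " | ", then trim the ends.
    $term = preg_replace('/\s+/', ' ', $term);
    $term = preg_replace('/ ?\|[ |]*/', ' | ', $term);
    $term = trim($term, '| ');

    // Step 1 from the answer: append "*" to every word.
    $term = preg_replace('/[a-z]+/i', '$0*', $term);

    // Step 2 from the answer: wrap each pipe-joined group in +( ... ).
    return preg_replace('/(?:[^\s|]+\s*\|\s*)*(?:[^\s|]+)/', '+($0)', $term);
}

echo toBooleanQuery(' |appl | pinea | orang frui | vege lates'), PHP_EOL;
// +(appl* | pinea* | orang*) +(frui* | vege*) +(lates*)
```

Like the answer's pattern, the star step only matches ASCII letters, so words containing digits or accented characters would need a wider character class.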


Answered By - Dean Taylor
Answer Checked By - Dawn Plyler (PHPFixing Volunteer)

[FIXED] how to remove multiple slashes in URI with 'PREG' or 'HTACCESS'

 November 20, 2022     .htaccess, php, preg-match, preg-replace, preg-split     No comments   

Issue

How can I remove multiple slashes in a URI with the preg functions or .htaccess?

site.com/edition/new/// -> site.com/edition/new/


site.com/edition///new/ -> site.com/edition/new/

thanks


Solution

In a regex, the plus symbol + means one or more occurrences of the preceding character. So we can use it in preg_replace to replace each run of one or more / with a single one:

$url = "site.com/edition/new///";

$newUrl = preg_replace('/(\/+)/', '/', $url);

// every run of slashes is now a single forward slash
echo $newUrl;
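
One caveat: if the string includes a scheme, the pattern above also turns "http://" into "http:/". A minimal sketch of a variant that leaves the slashes directly after a colon alone follows; the function name collapseSlashes() is hypothetical.

```php
<?php
// Hypothetical helper: collapse runs of "/" but keep the "//" that follows "scheme:".
function collapseSlashes(string $url): string
{
    // (?<!:) rejects a run that starts directly after a colon, e.g. in "http://".
    // {2,} only touches runs of two or more slashes.
    return preg_replace('#(?<!:)/{2,}#', '/', $url);
}

echo collapseSlashes('site.com/edition///new/'), PHP_EOL;        // site.com/edition/new/
echo collapseSlashes('http://site.com/edition/new///'), PHP_EOL; // http://site.com/edition/new/
```

Using # as the delimiter avoids having to escape every forward slash inside the pattern.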


Answered By - Ibu
Answer Checked By - Candace Johnson (PHPFixing Volunteer)

[FIXED] how to do a preg_replace in twig

 November 20, 2022     preg-match, preg-replace, symfony, twig     No comments   

Issue

I currently am trying to make a variable using the current url for the view. To get the url, I am using {% set uri = app.request.uri %} (uri being my variable which is the current url for that particular view). The thing is, I am only interested in what makes the url unique (the end of it - a unique object from an array - happens to be a uri), and not the beginning (path to my application). I was thinking I could use a preg_replace to do so, but TWIG doesn't have this function. Just wondering if someone would know how to accomplish what I am trying to do?

I'm new to Symfony (and fairly new to PHP), so my explanations may not be clear (sorry).

Ex.

{% set uri = app.request.uri %}

output: http://website.com/http://item.org/1

I want to modify the uri variable to ONLY have http://item.org/1 (and not the path to my website).

I'm thinking creating a Twig Extension with the preg_replace will allow me to do this ..but not sure if it's the best way to go (inexperienced).


Overall goal: The unique value for "uri" in the view is appended to the websites path by another view from an array of objects ($results) with attributes, one being "uri". My ultimate goal is to only display all associated attributes (or row) for an object in my $results array. I was thinking I could do this by first creating a key (my uri variable) in a foreach, and returning the row in the array which matches this key. This is why I am trying to create a variable with the url so that I can use it as a key for my foreach loop to iterate over $results. I am NOT using a database or Doctrine.

Thank you ahead of time for the help!


Solution

The best way is to move the logic from the template to the controller.

If you need preg_replace in Twig, you must create a custom extension.
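
As a rough illustration of the controller-side approach, the embedded URL could be extracted in plain PHP before the value is passed to the template. The function name extractItemUri() and the pattern are assumptions based on the example URL in the question, not Symfony or Twig API.

```php
<?php
// Hypothetical controller-side helper; name and pattern are assumptions.
function extractItemUri(string $requestUri): ?string
{
    // Skip the application's own scheme and host, then capture an embedded absolute URL.
    if (preg_match('#^https?://[^/]+/(https?://.+)$#', $requestUri, $m) === 1) {
        return $m[1];
    }

    // No embedded URL found.
    return null;
}

var_dump(extractItemUri('http://website.com/http://item.org/1'));
// string(17) "http://item.org/1"
```

The controller can then hand the result to the view as an ordinary variable, so the template needs no regex at all.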



Answered By - Alexey B.
Answer Checked By - Robin (PHPFixing Admin)

[FIXED] How can I preg_match script tag src, but avoid effecting img tag src?

 November 20, 2022     php, preg-match, preg-replace     No comments   

Issue

I have to match local src's and make them load via the web. Example:

src="/js/my.js">

Becomes:

src="http://cdn.example.com/js/my.js">

This is what I have now:

if (!preg_match("#<script(.+?) src=\"http#i",$page)){ 
$page = preg_replace("#<script(.+?) src=\"#is", "<script$1 src=\"$workingUrl", $page); 
}

It works fine when it encounters something like this:

<script type='text/javascript' src='/wp-includes/js/jquery/jquery.js?ver=1.8.3'></script>

It fails when it encounters something like this:

<script language="JavaScript">
window.moveTo(0,0);
window.resizeTo(screen.width,screen.height);
</script>

If the script tag doesn't contain a src it will then find the src of the first image tag and switch out its URL.

I need to know how to get it to terminate the match on the script tag only and/or how to perform the replacement better.


Solution

Definitely use a DOM parser. Xpath with DOMDocument will cleanly, reliably replace the script tags that:

  1. Have a src attribute and
  2. The src attribute does not start with http.

I could have further developed the xpath query expression to check for the leading http substring, but I didn't want to scare you off with more syntax.

Code: (Demo)

$html = <<<HTML
<html>
<head>
<script type='text/javascript' src='/wp-includes/js/jquery/jquery.js?ver=1.8.3'></script>
<script language="JavaScript">
window.moveTo(0,0);
window.resizeTo(screen.width,screen.height);
</script>
</head>
</html>
HTML;

$workingUrl = 'https://www.example.com';

$dom = new DOMDocument; 
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
foreach ($xpath->query("//script[@src]") as $node) {
    $src = $node->getAttribute('src');
    if (strpos($src, 'http') !== 0) {
        // Prefix the local path with the base URL instead of discarding it
        $node->setAttribute('src', $workingUrl . $src);
    }
}
echo $dom->saveHTML();

Output:

<html>
<head>
<script type="text/javascript" src="https://www.example.com/wp-includes/js/jquery/jquery.js?ver=1.8.3"></script>
<script language="JavaScript">
window.moveTo(0,0);
window.resizeTo(screen.width,screen.height);
</script>
</head>
</html>

The only slightly "scarier" xpath version: (Demo)

$dom = new DOMDocument; 
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
foreach ($xpath->query("//script[@src and not(starts-with(@src,'http'))]") as $node) {
    // Prefix the local path with the base URL instead of discarding it
    $node->setAttribute('src', $workingUrl . $node->getAttribute('src'));
}
echo $dom->saveHTML();


Answered By - mickmackusa
Answer Checked By - Timothy Miller (PHPFixing Admin)

[FIXED] How to convert a number of spaces(if more than one) to single dash using PHP

 November 20, 2022     php, preg-match, preg-replace, replace, string     No comments   

Issue

I can convert a space to a dash:

$string = str_replace(' ', '-', $string);

But I want to convert any number of consecutive spaces (two, three, four, five, six, and so on) into a single dash.


Solution

Try this regex using preg_replace

$string = preg_replace( '!\s+!', '-', $string );
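
Note that \s matches tabs and line breaks as well as spaces. The sketch below contrasts the pattern above with a spaces-only variant; the sample strings and variable names are made up for illustration.

```php
<?php
// \s+ collapses any run of whitespace (spaces, tabs, newlines) into one dash.
$anyWhitespace = preg_replace('/\s+/', '-', "red   green  blue yellow");
echo $anyWhitespace, PHP_EOL; // red-green-blue-yellow

// " +" collapses runs of literal spaces only and leaves the tab untouched.
$spacesOnly = preg_replace('/ +/', '-', "red   green\tblue");
echo $spacesOnly, PHP_EOL;
```

Both patterns also replace a single space, which matches the behaviour of the original str_replace call.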


Answered By - Amit
Answer Checked By - Willingham (PHPFixing Volunteer)

[FIXED] How to isolate content until the first double newline sequence?

 November 20, 2022     php, preg-match, preg-replace, regex, substring     No comments   

Issue

<ul>
    <li><a href="#">Foo</a></li>
    <li><a href="#">Foo</a></li>
    <li><a href="#">Foo</a></li>
</ul>

<ul>
    <li><a href="#">Bar</a></li>
    <li><a href="#">Bar</a></li>
    <li><a href="#">Bar</a></li>
</ul>

How can I get any content until the first blank line?

NOTE: First and second part of the content doesn't always start with ul.


Solution

preg_match('/\A.*?(?=\s*^\s*$)/smx', $subject, $regs);
$result = $regs[0];

Explanation

preg_match(
    '/\A    # Start of string
    .*?     # Match any number of characters (as few as possible)
    (?=     # until it is possible to match...
     \s*    #  trailing whitespace, including a linebreak 
     ^      #  Start of line
     \s*    #  optional whitespace
     $      #  End of line
    )       # (End of lookahead assertion)/smx', 
    $subject, $regs);
$result = $regs[0];

This assumes that lines containing nothing but whitespace count as blank lines. If they should not, remove the "optional whitespace" line.
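
When the input contains no blank line, preg_match() returns 0 and $regs[0] is never set. A small wrapper can guard against that; the name untilFirstBlankLine() and the choice to fall back to the whole input are assumptions, not part of the answer.

```php
<?php
// Hypothetical wrapper around the answer's pattern.
function untilFirstBlankLine(string $subject): string
{
    if (preg_match('/\A.*?(?=\s*^\s*$)/smx', $subject, $regs) === 1) {
        return $regs[0];
    }

    // Assumption: with no blank line present, return the input unchanged.
    return $subject;
}

$html = "<ul>\n    <li>Foo</li>\n</ul>\n\n<ul>\n    <li>Bar</li>\n</ul>";
echo untilFirstBlankLine($html), PHP_EOL; // prints only the first <ul> block
```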



Answered By - Tim Pietzcker
Answer Checked By - Cary Denson (PHPFixing Admin)

[FIXED] When do I need u-modifier in PHP regex?

 November 20, 2022     pcre, php, preg-match, preg-replace, utf-8     No comments   

Issue

I know that PHP's PCRE functions treat strings as byte sequences, so many sites suggest using the /u modifier to handle both the input and the regex as UTF-8.

But do I really always need this? My tests show that this flag makes no difference when I don't use escape sequences, the dot, or anything similar.

For example

preg_match('/^[\da-f]{40}$/', $string); to check if string has format of a SHA1 hash

preg_replace('/[^a-zA-Z0-9]/', $spacer, $string); to replace every char that is non-ASCII letter or number

preg_replace('/^\+\((.*)\)$/', '\1', $string); for getting inner content of +(XYZ)

These regexes contain only single-byte ASCII symbols, so they should work on every input, regardless of encoding, shouldn't they? Note that the third regex uses the dot operator, but as I cut off some ASCII chars at the beginning and end of the string, this should work on UTF-8 too, correct?

Can anyone tell me if I'm overlooking something?


Solution

There is no problem with the first expression. The characters being quantified are explicitly single-byte, and cannot occur in a UTF-8 multibyte sequence.

The second expression may give you more spacers than you expect; for example:

echo preg_replace('/[^a-zA-Z0-9]/', "0", "💩");
// => 0000

The third expression also does not pose a problem, as the repeated character is limited by parentheses (which is ASCII-safe).

This is more dangerous:

echo preg_replace('/^(.)/', "0", "💩");
// => 0???

Typically, without knowing more about how UTF-8 works, it may be tricky to predict which regexps are safe, and which are not, so using /u for all text that might contain a character above U+007F is the best practice.
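
To see the difference the modifier makes, the same replacements can be run with and without /u. This is a small sketch; the variable names are illustrative, and "\u{1F4A9}" is just an encoding-independent way of writing the emoji used above.

```php
<?php
$subject = "\u{1F4A9}"; // one character, four bytes in UTF-8

// Without /u every byte is treated as a separate character.
$byteWise = preg_replace('/[^a-zA-Z0-9]/', '0', $subject);
echo $byteWise, PHP_EOL; // 0000

// With /u the four bytes are matched as a single character.
$charWise = preg_replace('/[^a-zA-Z0-9]/u', '0', $subject);
echo $charWise, PHP_EOL; // 0

// Without /u the dot consumes only the first byte and leaves invalid UTF-8 behind.
$broken = preg_replace('/^(.)/', '0', $subject);
var_dump(strlen($broken));                  // int(4)
var_dump(preg_match('//u', $broken) === 1); // bool(false): no longer valid UTF-8
```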



Answered By - Amadan
Answer Checked By - Candace Johnson (PHPFixing Volunteer)

[FIXED] How to write preg_match for a date followed by specific string?

 November 20, 2022     php, preg-match, preg-replace, regex     No comments   

Issue

I want to extract dates from several HTML documents. The date always follows this pattern:

  1. The first three letters of the month, with the first character in uppercase, e.g. Jan.
  2. The two-digit day of the month, e.g. 09.
  3. A comma as a separator.
  4. The four-digit year, e.g. 2022.

A sample of the complete date is Jan 09, 2022

I want to extract only those dates which are wrapped in span tags. So, the complete pattern is

<span>Jan 09, 2022</span>

I am not good at writing preg_match patterns. Can anyone please help me?


Solution

<span>(\w{3} \d{1,2}, \w{4})<\/span>

\w is a meta-character for the set [a-zA-Z0-9_].

{3} means thrice.

\d is a meta-character for the set [0-9].

{1,2} means once or twice.

Try it https://regex101.com/r/tNRa73/1

$pattern = '/<span>(\w{3} \d{1,2}, \w{4})<\/span>/'; 

preg_match(
  $pattern,
  $html,
  $matches // <-- The results will be added to this new variable.
);

$matches[1]; // The date will be in the first index because it was
             // the first "capture group" i.e set of parens.


// If you expect multiple dates in one HTML document, then use:
preg_match_all(
  $pattern,
  $html,
  $matches
);

$matches[1]; // Now, the first index is an array of matches of
             // the first "capture group".
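
The pattern above is deliberately loose: \w{3} and \w{4} also accept digits and underscores. If only the exact format from the question should match, a stricter variant could look like the sketch below. The function name extractSpanDates() and the tightened pattern are assumptions, not part of the original answer.

```php
<?php
// Hypothetical helper with a stricter pattern than the answer's.
function extractSpanDates(string $html): array
{
    // [A-Z][a-z]{2} = capitalised three-letter month, \d{2} = day, \d{4} = year.
    $pattern = '/<span>([A-Z][a-z]{2} \d{2}, \d{4})<\/span>/';
    preg_match_all($pattern, $html, $matches);

    // Index 1 holds every match of the first capture group (empty array if none).
    return $matches[1];
}

$html = '<p><span>Jan 09, 2022</span> <span>Foo 1, abcd</span> <span>Dec 31, 2021</span></p>';
print_r(extractSpanDates($html));
// Array ( [0] => Jan 09, 2022 [1] => Dec 31, 2021 )
```

This still accepts strings such as "Xyz 99, 0000"; checking that the month and day are real would need a further step, for example parsing the captured text as a date.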


Answered By - rhinosforhire
Answer Checked By - Terry (PHPFixing Volunteer)

Sunday, November 6, 2022

[FIXED] How to disallow repeated numbers in telephone field like 1111111 or 222222

 November 06, 2022     contacts, phone-number, preg-match     No comments   

Issue

I am using contact form 7 and want to disallow users to enter repeated numbers like 1111111 or 2222222 in phone field.

I am using below code to enter only 10 digits. Can anyone help me with what I should change or add in this to work.

// define the wpcf7_is_tel callback
function custom_filter_wpcf7_is_tel( $result, $tel ) {
  $result = preg_match( '/^\(?\+?([0-9]{0})?\)?[-\. ]?(\d{10})$/', $tel );
  return $result;
}

add_filter( 'wpcf7_is_tel', 'custom_filter_wpcf7_is_tel', 10, 2 );

Solution

First of all, [0-9]{0} looks like a typo as this pattern matches nothing, an empty string. You probably wanted to match an optional area code, three digits. So, use \d{3} if you meant that.

Next, to disallow same digits within those you match with \d{10}, you simply need to re-write it as (?!(\d)\1{9})\d{10}.

Bearing in mind what was said above, the solution is

function custom_filter_wpcf7_is_tel( $result, $tel ) {
  return preg_match('/^\(?\+?(?:\d{3})?\)?[-. ]?(?!(\d)\1{9})\d{10}$/', $tel );
}
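
The pattern can be tried outside WordPress with a plain function. The sketch below uses a hypothetical isValidTel() wrapper and made-up sample numbers; only the regex itself comes from the answer.

```php
<?php
// Hypothetical stand-alone wrapper around the answer's pattern.
function isValidTel(string $tel): bool
{
    // (?!(\d)\1{9}) rejects ten repetitions of the same digit before \d{10} consumes them.
    return preg_match('/^\(?\+?(?:\d{3})?\)?[-. ]?(?!(\d)\1{9})\d{10}$/', $tel) === 1;
}

var_dump(isValidTel('9876543210'));     // bool(true)
var_dump(isValidTel('1111111111'));     // bool(false): one digit repeated ten times
var_dump(isValidTel('123-9876543210')); // bool(true): optional three-digit prefix
var_dump(isValidTel('12345'));          // bool(false): too short
```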

See the regex demo.



Answered By - Wiktor Stribiżew
Answer Checked By - Marie Seifert (PHPFixing Admin)

Sunday, February 6, 2022

[FIXED] Why is preg_match() with pattern "75.1." and subject "75.142.3.8" returning a match in PHP?

 February 06, 2022     php, preg-match, regex     No comments   

Issue

I am testing the matching of a function I am using to check IP ranges.

$a = "75.1.";
$b = "75.142.3.8";
$num = preg_match("/^$a/", $b, $matches);
echo "$a<br/>\n$b<br/>\n$num";

This is returning a $num of 1 for me. Why is this returning a match, despite "1." not matching "142"? Am I making some error with regular expressions? strstr($b, $a) and str_starts_with($b, $a) are both returning FALSE, as I was expecting.


Solution

In a regex, . is a wildcard that matches any single character, so "1." also matches "14". To match a literal dot, escape it with a \

$a = "75\.1\.";
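
When the prefix comes from a variable, escaping by hand is easy to forget. PHP's preg_quote() escapes every regex metacharacter in a string; a minimal sketch, with a hypothetical ipStartsWith() helper, might look like this:

```php
<?php
// Hypothetical helper: match $prefix literally at the start of $ip.
function ipStartsWith(string $prefix, string $ip): bool
{
    // preg_quote() escapes the dots; the second argument also escapes the "/" delimiter.
    $pattern = '/^' . preg_quote($prefix, '/') . '/';
    return preg_match($pattern, $ip) === 1;
}

var_dump(ipStartsWith('75.1.', '75.142.3.8')); // bool(false)
var_dump(ipStartsWith('75.1.', '75.1.2.3'));   // bool(true)
```

For a purely literal prefix check like this one, str_starts_with() (PHP 8) gives the same result without any regex, as the question already observed.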


Answered By - Optimum

Saturday, January 22, 2022

[FIXED] PHP: Check for characters in the Latin script plus spaces and numbers

 January 22, 2022     php, preg-match, regex     No comments   

Issue

I am new to regex and I have been going round and round on this problem.

PHP: Check alphabetic characters from any latin-based language? gives the brilliant regex to check for any characters in the Latin script, which is part of what I need.

^\p{Latin}+$

and provides a working example at https://regex101.com/r/I5b2mC/1

If I use the regex in PHP by using

echo preg_match('/^\p{Latin}+$/', $testString);

and $testString contains only Latin letters, the output will be 1. If there are any non-Latin letters, the output will be 0. Brilliant.

To add numbers in I tried ^\p{Latin}+[[:alnum:]]*$ but that allows any characters in the Latin script OR non-Latin letters and numbers (letters without accents — grave, acute, cedilla, umlaut etc.) as it is the equivalent to [a-zA-Z0-9].

If you add any numbers with characters in the Latin script, echo preg_match('/^\p{Latin}+[[:alnum:]]*$/', $testString); returns a 0. All numbers return a 0 too. This can be confirmed by editing the expression in https://regex101.com/r/I5b2mC/1

How do I edit the expression in echo preg_match('/^\p{Latin}+$/', $testString); to output a 1 if there are any characters in the Latin script, any numbers and/or spaces in $testString? For example, I wish for a 1 to be output if $testString is Café ßüs 459.


Solution

There are three things to change:

  • Add the u flag to support chars other than ASCII (/^\p{Latin}+$/ => /^\p{Latin}+$/u)
  • Create a character class for letters, digits and whitespace patterns (/^\p{Latin}+$/u => /^[\p{Latin}]+$/u)
  • Then add the digit and whitespace patterns. If you need to support any Unicode digits, add \d. If you need to support only ASCII digits, add 0-9.

Thus, you can use

preg_match('/^[\p{Latin}\s0-9]+$/u', $testString) // ASCII only digits
preg_match('/^[\p{Latin}\s\d]+$/u', $testString)  // Any digits

Also, \s with u flag will match any Unicode whitespace chars.
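
A quick check of the ASCII-digit variant against the sample from the question could look like the following sketch. The function name isLatinText() and the Cyrillic counter-example are assumptions added for illustration; the strings use \u{...} escapes so the result does not depend on the file's encoding.

```php
<?php
// Hypothetical wrapper around the first pattern above.
function isLatinText(string $testString): bool
{
    return preg_match('/^[\p{Latin}\s0-9]+$/u', $testString) === 1;
}

var_dump(isLatinText("Caf\u{E9} \u{DF}\u{FC}s 459"));            // "Café ßüs 459" => bool(true)
var_dump(isLatinText("\u{41A}\u{430}\u{444}\u{435} 459"));       // Cyrillic letters => bool(false)
var_dump(isLatinText(''));                                       // empty string => bool(false)
```

Because of the + quantifier an empty string fails the check; swap it for * if empty input should pass.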



Answered By - Wiktor Stribiżew
Copyright © PHPFixing