To better understand regular expressions, I recommend learning the connection between REs and non-deterministic finite state machines (aka nondeterministic finite automata, or NFA). The subject is illuminating, though it should be noted that Perl-compatible regular expressions (PCREs, which are what the preg_* functions use) are more expressive and powerful than classic REs.
I want that when you input a invalid character that it doesn't save your message:
In that case, you don't want to negate the result of the preg_match, to start with. What will happen is that `preg_match` will do its best to find a match. As long as there's a valid character, the match would succeed, causing the invalid test to fail. The only way your test would succeed is if the input contains no valid characters.
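Here's a minimal sketch of that failure. The pattern and the function name looks_invalid_buggy are my assumptions for illustration, not the code from your post:

```php
<?php
// Hypothetical reconstruction of the flawed test: negate a match
// against a class of VALID characters.
function looks_invalid_buggy(string $msg): bool {
    return !preg_match('/[a-zA-Z0-9]/', $msg);
}

var_dump(looks_invalid_buggy('hello#')); // false: "h" matched, so "#" slips through
var_dump(looks_invalid_buggy('#%^'));    // true: no valid character anywhere
```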
Note: you've changed what punctuation is valid. In your first post, you specified [!@?.:$=&]. In your last, you're looking for [!@$_=-+]. I'm going to assume you want the union of these classes, plus a few others.
Fortunately, all you need is a single character class. You can either have a character class for valid characters, and have that class match the entire string, or have a character class for invalid characters and have that match at least once. Defining the valid characters as a class is easy, and negating a character class is easier: just place a caret ("^") immediately after the opening square bracket. That is, [^0-9] is the negation of [0-9]. The preceding classes are equivalent to "\D" and "\d", respectively. Similarly, "\w" and "[_a-zA-Z0-9]" are the opposite of "\W" and "[^_a-zA-Z0-9]".
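If you want to convince yourself of these equivalences, a quick check along these lines should do. It's only a sketch: the function name shorthand_agrees and the sample strings are made up, and it assumes plain ASCII input without the "u" modifier:

```php
<?php
// Returns true when each negated class agrees with its shorthand for $s.
function shorthand_agrees(string $s): bool {
    return preg_match('/[^0-9]/', $s) === preg_match('/\D/', $s)
        && preg_match('/[^_a-zA-Z0-9]/', $s) === preg_match('/\W/', $s);
}

foreach (['abc', '123', 'a_1', 'x-y', ' ', ''] as $s) {
    var_dump(shorthand_agrees($s)); // expected to be true for ASCII input
}
```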
Note that if you put two opposite special characters (i.e. those that begin with a backslash) in a character class, the result will match every character, because every character will match either one special character or its opposite. That is, classes including "\w\W", "\s\S", "\d\D" &c match any character at all, newlines included. That makes them equivalent to "." with the "s" (dot-all) modifier; without that modifier, "." matches every character except a newline.
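One detail worth checking for yourself is how these classes and "." treat a newline. A small sketch (the variable names are just for illustration):

```php
<?php
$newline = "\n";

$class_hit  = preg_match('/[\s\S]/', $newline); // class built from opposites
$dot_hit    = preg_match('/./', $newline);      // plain dot, no modifiers
$dotall_hit = preg_match('/./s', $newline);     // dot with the "s" modifier

var_dump($class_hit);  // int(1)
var_dump($dot_hit);    // int(0): by default "." skips newlines
var_dump($dotall_hit); // int(1)
```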
What you want, if you haven't guessed, is:
PHP:
if (preg_match('/[^-\s\w.!?,:;@$=&]/', $msg)) {
// message contains invalid characters; print error message and process no further.
...
} else {
// message contains only valid characters; process it.
}
Until you're familiar with backslash escaping in strings, you should probably escape the '\' in the above RE as a reminder, even though it's not necessary. If you used double quotes, the pattern would need to be "/[^-\\s\\w.!?,:;@\$=&]/".
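If the quoting rules are hard to keep straight, you can compare the strings PHP actually builds. This sketch (variable names are mine) checks that all three spellings produce the same pattern:

```php
<?php
$single         = '/[^-\s\w.!?,:;@$=&]/';     // single quotes, backslashes as-is
$escaped_single = '/[^-\\s\\w.!?,:;@$=&]/';   // single quotes, backslashes escaped
$double         = "/[^-\\s\\w.!?,:;@\$=&]/";  // double quotes, "\" and "$" escaped

var_dump($single === $escaped_single); // bool(true)
var_dump($single === $double);         // bool(true)
```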
Earlier I mentioned you could check that all the characters are valid (rather than that at least one is invalid); the pattern for this is /^[-\s\w.!?,:;@$=&]*$/ (if you also want to check that the message has at least one character, change the "*" to a "+"). However, looking for an invalid character is the simpler pattern.
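As a rough sketch, here are the two approaches side by side. The function names and sample messages are made up, and I'm assuming the same character class (the one from the earlier snippet) in both patterns:

```php
<?php
// Approach 1: look for at least one character outside the allowed class.
function has_invalid_char(string $msg): bool {
    return preg_match('/[^-\s\w.!?,:;@$=&]/', $msg) === 1;
}

// Approach 2: require every character to be inside the allowed class.
function all_chars_valid(string $msg): bool {
    return preg_match('/^[-\s\w.!?,:;@$=&]*$/', $msg) === 1;
}

foreach (['Hello, world!', 'a = b & c', 'bad <tag>', ''] as $msg) {
    // "All valid" should hold exactly when "has invalid" does not.
    var_dump(all_chars_valid($msg) === !has_invalid_char($msg));
}
```

Given the same class, the two tests should agree on every message, so which one you use is mostly a matter of taste.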
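Note that all the valid characters are inside the square brackets because they're all supposed to belong to the same character class; if you place them outside the brackets, you're asking them to match subsequent characters. Also, some of the characters (?, ., + and $, in your case) have special meanings outside of square brackets: "?" makes the preceding character optional, "." matches any single character (other than a newline), "+" means one or more of the preceding character, and "$" anchors the match to the end of the string. Because of that unescaped "$", '/[a-zA-Z0-9]!@?.$_=-+\s/' can't match anything as written. Even with the "$" escaped,
'/[a-zA-Z0-9]!@?.\$_=-+\s/' will match: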
- 'a!^$_=- '
- 'Q!@($_=- '
- '9!@.$_=----- '
But not:
- 'a!^$_= ' (no "-")
- 'Q!@@($_=- ' (too many "@")
- '9!@?.$_=----- ' (has a "?")
- 'A!@?{$_=-+ ' (has a "?" and "+")
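You can run the examples above through preg_match to check them. Note the assumption in this sketch: the "$" in the pattern is escaped so it's a literal dollar sign, since an unescaped "$" outside a character class is an end-of-string anchor in PCRE:

```php
<?php
// The "$" is escaped so that it is a literal dollar sign (an assumption).
$pattern = '/[a-zA-Z0-9]!@?.\$_=-+\s/';

$subjects = [
    'a!^$_=- ', 'Q!@($_=- ', '9!@.$_=----- ',                 // expected to match
    'a!^$_= ', 'Q!@@($_=- ', '9!@?.$_=----- ', 'A!@?{$_=-+ ', // expected not to
];

$results = [];
foreach ($subjects as $s) {
    $results[$s] = preg_match($pattern, $s) === 1;
    printf("%-18s %s\n", "'$s'", $results[$s] ? 'match' : 'no match');
}

// With the "$" left unescaped it acts as an end-of-subject anchor,
// so nothing can follow it and the pattern cannot match.
$unescaped = preg_match('/[a-zA-Z0-9]!@?.$_=-+\s/', 'a!^$_=- ');
var_dump($unescaped); // int(0)
```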
Furthermore, if you use double quotes and don't escape the "$", then "$_" will be replaced by the value of the $_ variable, which is most likely empty.
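To see the substitution happen, here's a small sketch. The value assigned to $_ is made up purely so the replacement is visible; in your script the variable would most likely be unset:

```php
<?php
$_ = 'XYZ'; // hypothetical value, only here to make the interpolation visible

$single = '/[a-zA-Z0-9]!@?.$_=-+\s/'; // single quotes: "$_" is left alone
$double = "/[a-zA-Z0-9]!@?.$_=-+\s/"; // double quotes: "$_" is interpolated

var_dump($single === $double);                      // bool(false)
var_dump($double === '/[a-zA-Z0-9]!@?.XYZ=-+\s/');  // bool(true)
```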