Finding non-UTF-8 characters

July 30, 2023

Using Perl, I found some characters in my webpages that were not valid UTF-8.

I typed this command, which prints every line of 'cr.html' that contains a non-ASCII character:

perl -ne 'print if /[^[:ascii:]]/' cr.html

The W3C HTML validator noticed the invalid characters and would not validate the page while they were there.

The invalid characters were accented letters and the ligature 'æ' (the 'a' and 'e' combined) in 'Julius Caesar'. They were in quotes I took from 'wikiquote.org'.

Several suggested checks for invalid characters did not work for me because my version of Linux is 10 years old.

I found the successful command at 'linuxhandbook.com'. This looks like a good website for learning about Linux. It reminds me a bit of the old O'Reilly books such as 'Unix Power Tools'.

There are more tips at: bbingo.xyz/techtips/
