Replacing with sed and regexes

Let’s say you have an HTML file called file.html and you want to replace “.jpg” with “.png”, but only inside the href of anchor elements.

Example of input:
[html]<a href="alice.jpg">alice</a>
<a href="bob.jpg">bob</a>
something.jpg
href.jpg
<a href="example.com">alice.jpg</a>
<img src="href.jpg">
[/html]

Desired output:
[html]<a href="alice.png">alice</a>
<a href="bob.png">bob</a>
something.jpg
href.jpg
<a href="example.com">alice.jpg</a>
<img src="href.jpg">
[/html]

Notice that only the first two occurrences of “.jpg” were changed to “.png”: the ones inside the href of an anchor.

You can use sed with regexes to achieve this.
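[bash]
sed -i -E 's/(<a href="[^"]*)\.jpg(")/\1.png\2/' file.html
[/bash]

Where:

  • -i edits the file in place (this is the GNU sed form; BSD/macOS sed expects a backup suffix after -i, which may be empty: -i '')
  • -E enables extended regular expressions, so ( and ) form groups without backslashes
  • s/// is the substitute command, s/pattern/replacement/; with no g flag, only the first match on each line is replaced
  • (<a href="[^"]*) is group 1: the string '<a href="' followed by zero or more characters other than a double quote. Using [^"]* rather than .* keeps the match inside the href value, because .* is greedy and could run on to a later .jpg" on the same line
  • \.jpg is the literal .jpg we want to replace; the backslash stops the dot from matching any character
  • (") is group 2: only the closing double quote
  • \1.png\2 is the replacement: group 1, then .png, then group 2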
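
Since -i overwrites the file, it can help to preview the result first. The following is only a sketch: it recreates the sample input in a temporary directory (the temporary path and the variable names are assumptions made for this example), runs the same substitution without -i so that sed prints to stdout, and then shows the lines that would change. The pattern is written here with an escaped dot and [^"]* in place of .* so that the match cannot run past the closing quote of the href.

```bash
# Build a throwaway copy of the sample input (path is hypothetical, for this sketch only).
tmpdir=$(mktemp -d)
sample="$tmpdir/file.html"
cat > "$sample" <<'EOF'
<a href="alice.jpg">alice</a>
<a href="bob.jpg">bob</a>
something.jpg
href.jpg
<a href="example.com">alice.jpg</a>
<img src="href.jpg">
EOF

# Without -i, sed writes the result to stdout and leaves the file untouched.
preview=$(sed -E 's/(<a href="[^"]*)\.jpg(")/\1.png\2/' "$sample")
printf '%s\n' "$preview"

# Show only the lines that would change (diff exits with 1 when the inputs differ).
changes=$(printf '%s\n' "$preview" | diff "$sample" - || true)
printf '%s\n' "$changes"

# Clean up the temporary copy.
rm -rf "$tmpdir"
```

Once the preview looks right, adding -i back applies the same change to the real file.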