David Nash

Wordpress Guru Sydney

Delete links across multiple HTML files in bash

Posted on December 5, 2011

Or any type of file, really. I needed to quickly remove links from an old website that was using flat HTML files. On the Linux command line, I found I could do:

perl -pi -e 's/SEARCH/REPLACE/g' *.html

To replace all instances of SEARCH with REPLACE in *.html.

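Except I needed to do a fair bit of escaping, because HTML is full of characters that can mean something else to the shell or to the regular expression. (Strictly, inside single quotes only the / has to be escaped here, because it is the delimiter of the s/// expression; the extra backslashes are harmless.)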

So let’s say the string I needed to remove was:

<a title="Search Engine Optimisation" href="http://superspammyseocompany.com/" target="_self"><span>Search Engine Optimisation</span></a> by <a title="Super Spammy SEO Company" href="http://superspammyseocompany.com/" target="_self">Super Spammy SEO Company</a>

I copy + pasted this into vim, and then every time these characters occur:

<, >, / and "

I put a \ in front of each of these, which gave me:

\<a title=\"Search Engine Optimisation\" href=\"http:\/\/superspammyseocompany.com\/\" target=\"_self\"\>\<span\>Search Engine Optimisation\<\/span\>\<\/a\> by \<a title=\"Super Spammy SEO Company\" href=\"http:\/\/superspammyseocompany.com\/\" target=\"_self\"\>Super Spammy SEO Company\<\/a\>

Which was a bit of work, but still much more fun than manually removing the link from each file.

Note that these characters do not need to be escaped with a backslash:

= (equals), . (dot), and  _ (underscore)

So my final command was:

perl -pi -e 's/\<a title=\"Search Engine Optimisation\" href=\"http:\/\/superspammyseocompany.com\/\" target=\"_self\"\>\<span\>Search Engine Optimisation\<\/span\>\<\/a\> by \<a title=\"Super Spammy SEO Company\" href=\"http:\/\/superspammyseocompany.com\/\" target=\"_self\"\>Super Spammy SEO Company\<\/a\>//' *.html

I’d already initialised a git repository and committed the files so I could easily restore the files in case of a mistake. A quick look through the links showed it all worked perfectly, and it saved me so much time I thought I’d write this post about it.

Bonus: I wrote the names of all the changed files to list.html, one filename per line, like:

./file1.html
./file2.html
./file3.html

Here’s the vim command to turn them all into links, for easy human checking:

:%s/^\(.*\)$/<a href="\1">\1<\/a><br>/
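The same transformation can be done without opening vim. This is a rough sketch using sed, assuming a small list.html built on the spot; it ends each line with a plain <br> tag:

```shell
workdir=$(mktemp -d)
printf '%s\n' './file1.html' './file2.html' > "$workdir/list.html"

# "|" as the sed delimiter avoids having to escape the "/" in </a>.
# \1 is the whole line, captured by \(.*\).
links=$(sed 's|^\(.*\)$|<a href="\1">\1</a><br>|' "$workdir/list.html")
printf '%s\n' "$links"
```

Redirect the output to a new file rather than back onto list.html, since the shell would truncate the input before sed reads it.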

HTML syntax highlighting for Silverstripe .ss template files in vim

Posted on August 4, 2009

By default vim opens .ss files with syntax highlighting meant for some other file format.

To enable HTML (actually XHTML) syntax highlighting on your .ss Silverstripe template files, create (or edit) your ~/.vim/filetype.vim file. Then enter this:

au BufNewFile,BufRead *.ss      setf xhtml

Then open a .ss file and it’ll give you nice HTML syntax highlighting. And because it’s in your home directory, it’ll keep working even after you upgrade vim.

Strip <span> tags from HTML in vim

Posted on April 6, 2009

When upgrading a website you might see source code like this:

<span style="font-family: Arial; color: #0000ff; font-size: small;">Some text goes here</span>

You’re using CSS now and all those <span> tags are ruining it. In gvim, do this search and substitute:

:%s/<span.\{-}>//g

Then to get rid of the </span> tags, do this:

:%s/<\/span>//g
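For a whole batch of files, the same two substitutions can be run from the shell. Here is a minimal sketch with sed on a sample string; sed has no non-greedy match, so [^>]* stands in for vim's .\{-}:

```shell
html='<p><span style="font-family: Arial; color: #0000ff;">Some text goes here</span></p>'

# First expression: remove each opening <span ...> tag.
#   [^>]* stops at the first ">", so only the tag itself is removed.
# Second expression: remove the closing </span> tags.
stripped=$(printf '%s\n' "$html" | sed -e 's/<span[^>]*>//g' -e 's|</span>||g')
printf '%s\n' "$stripped"
```

This assumes no attribute value contains a literal > character, which would end the match early.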

Strip ^M from file in vim

Posted on February 20, 2009

If you see ^M at the end of every line when you view a file in vim, you can do this:

:%s/^M//g

To get the “^M” bit, hit ctrl-v and then ctrl-m.
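The ^M is a carriage return, left over from Windows-style line endings. Outside vim, tr can strip it. A small sketch, assuming two made-up file names (dos.txt and unix.txt):

```shell
workdir=$(mktemp -d)
printf 'line one\r\nline two\r\n' > "$workdir/dos.txt"

# tr -d '\r' deletes every carriage return (the character vim shows as ^M).
tr -d '\r' < "$workdir/dos.txt" > "$workdir/unix.txt"

# Count the carriage returns in each file: tr -cd keeps only '\r'.
before=$(tr -cd '\r' < "$workdir/dos.txt" | wc -c | tr -d ' ')
after=$(tr -cd '\r' < "$workdir/unix.txt" | wc -c | tr -d ' ')
echo "before: $before, after: $after"
```

Note that this removes every carriage return in the file, not only those at the end of a line.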

Assorted Handy vim Commands, Part 1

Posted on January 28, 2009

To reverse the order of lines, e.g. lines 1-5:

:1,5g/^/m0

For example,

one
two
three
four
five

becomes

five
four
three
two
one

To remove blank lines:

:%g/^$/d

To delete all lines that don’t contain “string”:

:%v/string/d
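For comparison, here are rough shell counterparts of the three commands above, run on a small sample. They are stand-ins rather than exact equivalents: they read from a pipe instead of editing a buffer, and the reversal covers the whole input rather than a line range.

```shell
input=$(printf '%s\n' one two '' three)

# Reverse the line order (like :g/^/m0) by buffering the lines in awk
# and printing them from last to first.
reversed=$(printf '%s\n' "$input" | awk '{ line[NR] = $0 } END { for (i = NR; i > 0; i--) print line[i] }')

# Drop empty lines (like :g/^$/d).
no_blanks=$(printf '%s\n' "$input" | sed '/^$/d')

# Keep only the lines containing "t" (like :v/t/d).
only_t=$(printf '%s\n' "$input" | grep 't')

printf '%s\n' "$reversed" "---" "$no_blanks" "---" "$only_t"
```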

Turn all links in an HTML file into ‘#’

Posted on January 24, 2009

Replace all instances in the current file of:

<a href="link_to_somewhere.html">link</a>

with:

<a href="#">link</a>

:%s/\(a href="\)\(\S\+\)\("\)/\1#\3/g

This is great if you’ve got a big HTML file that you want to demo to a client where you don’t want the links to do anything when clicked.

You could also use JavaScript to do the same thing: just “return false” from the click handler of every <a> element on the page.
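To do this without opening vim, a sed one-liner along the same lines should work. This is a sketch on a sample string; it swaps \S\+ for [^"]*, which stops at the closing quote of each href:

```shell
html='<a href="link_to_somewhere.html">link</a> and <a href="other.html">more</a>'

# \(a href="\) is kept via \1; [^"]*" consumes the old target and its
# closing quote, which the replacement puts back after the "#".
dead_links=$(printf '%s\n' "$html" | sed 's/\(a href="\)[^"]*"/\1#"/g')
printf '%s\n' "$dead_links"
```

Like the vim version, it only touches links written exactly as a href="..." with double quotes.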

vim: Quickly assign POST variables in PHP

Posted on January 22, 2009

You’ve got a web form with lots of fields, and you want to POST them to a PHP script. Open vim, and list the INPUT tags’ NAME attributes, one per line.

<?php
firstname
lastname
address1
address2
state
postcode
?>

Now with some search-and-replace magic we can save ourselves a lot of boring typing. Hit ‘escape’ to get out of insert mode, type a colon (“:”) and copy-paste this:

%s/^\(\w\+\)$/$\1 = $_POST['\1'];/

I won’t bother explaining it in full unless someone asks, but note that \w\+ only matches lines made up entirely of word characters, so the <?php and ?> lines are left alone. What you should end up with is this:

<?php
$firstname = $_POST['firstname'];
$lastname = $_POST['lastname'];
$address1 = $_POST['address1'];
$address2 = $_POST['address2'];
$state = $_POST['state'];
$postcode = $_POST['postcode'];
?>

Perhaps not worth it when you have 6 variables, as in this example, but quite handy when you have 60!
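If the field names live in a plain list rather than a vim buffer, a shell loop can generate the same lines. Note this is a different technique from the substitution above (a read loop with printf instead of a regex), and the three names are just sample data:

```shell
names=$(printf '%s\n' firstname lastname postcode)

# Emit one assignment per field name. The \$ keeps the shell from
# expanding a variable; each %s is filled in with the field name.
assignments=$(printf '%s\n' "$names" | while IFS= read -r name; do
  printf "\$%s = \$_POST['%s'];\n" "$name" "$name"
done)
printf '%s\n' "$assignments"
```

Paste the output between the <?php and ?> lines of your script.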