

Remove Bulk Spam Comments from WordPress Export

Recently I was responsible for migrating a WordPress site from a third-party hosting service to my personal LEMP stack. The problem with this migration was that around 17,000 comments were sitting untouched in the comments section, bloating the XML export to roughly 51 MB.

Thankfully I found this handy post from Neil R suggesting sed as a possible solution. After I modified his solution slightly for OS X, it now removes all comments from the WordPress export with a single command in Terminal.

Shell Command

$ sed '/<wp:comment>/,/<\/wp:comment>/d' export.xml > export.clean.xml

This goes through export.xml and deletes every block of lines that starts at a line containing <wp:comment> and ends at the next line containing </wp:comment>. The result is written to export.clean.xml, and the original export.xml is left unchanged.
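To see the range delete in action before running it on a real export, here is a minimal sketch that applies the same sed expression to a tiny sample file. The sample content and the file names sample.xml and sample.clean.xml are made up for illustration; a real WordPress export has many more fields.

```shell
# Build a tiny stand-in for a WordPress export (hypothetical content).
workdir=$(mktemp -d)
cat > "$workdir/sample.xml" <<'EOF'
<item>
<title>Hello world</title>
<wp:comment>
<wp:comment_id>1</wp:comment_id>
<wp:comment_content>Buy cheap pills</wp:comment_content>
</wp:comment>
<wp:post_id>7</wp:post_id>
</item>
EOF

# Same range delete as the command above, applied to the sample file.
sed '/<wp:comment>/,/<\/wp:comment>/d' "$workdir/sample.xml" > "$workdir/sample.clean.xml"

# Count opening comment tags before and after the cleanup.
# grep -c exits non-zero when the count is 0, hence the "|| true".
before=$(grep -c '<wp:comment>' "$workdir/sample.xml")
after=$(grep -c '<wp:comment>' "$workdir/sample.clean.xml" || true)
echo "comments before: $before, after: $after"
```

Note that the pattern <wp:comment> does not match child tags such as <wp:comment_id>, because the closing > must follow the word comment directly. The range form also assumes the opening and closing tags sit on separate lines: sed only starts looking for the end pattern on the line after the one that matched the start, so a comment squeezed onto a single line would cause the delete to run on to the next closing tag.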

This post assumes that the reader knows how to use Terminal on a Mac to navigate between folders and edit specific files.

Additional Links on SED