Remove Bulk Spam Comments from WordPress Export
Recently I was responsible for migrating a WordPress site from a third-party hosting service to my personal LEMP stack. The issue with this migration was that around 17,000 comments were sitting untouched in the comments section, creating an excessive amount of bloat in the XML export, ~51 MB to be precise.
Thankfully, I found this handy post from Neil R suggesting sed as a possible solution. After I modified his solution slightly for OS X, it now removes all comments from the WordPress export via a single command in Terminal.
Shell Command
$ sed '/<wp:comment>/,/<\/wp:comment>/d' export.xml > export.clean.xml
This goes through export.xml, removing each <wp:comment> ... </wp:comment> block it comes across (the opening tag, the closing tag, and every line between them), and then saves the output in export.clean.xml.
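To sanity-check the result, it can help to count the comment tags before and after. The following is only a sketch: it builds a tiny made-up export in a temporary folder (the sample contents and the workdir, before and after variable names are my own assumptions, not part of a real export), runs the same sed command against it, and compares the counts with grep -c.

```shell
# Hypothetical sample file standing in for a real WordPress export.
workdir=$(mktemp -d)
cat > "$workdir/export.xml" <<'EOF'
<item>
<title>Hello world</title>
<wp:comment>
<wp:comment_id>1</wp:comment_id>
<wp:comment_content>spam</wp:comment_content>
</wp:comment>
<wp:comment>
<wp:comment_id>2</wp:comment_id>
<wp:comment_content>more spam</wp:comment_content>
</wp:comment>
</item>
EOF

# Same sed command as above, run against the sample file.
sed '/<wp:comment>/,/<\/wp:comment>/d' "$workdir/export.xml" > "$workdir/export.clean.xml"

# Count the lines containing an opening comment tag before and after.
# grep exits non-zero when nothing matches, hence the "|| true".
before=$(grep -c '<wp:comment>' "$workdir/export.xml")
after=$(grep -c '<wp:comment>' "$workdir/export.clean.xml" || true)
echo "comments before: $before, after: $after"
```

On a real export, the same two grep -c calls can be pointed at export.xml and export.clean.xml; the second count should come back as 0 if every comment was removed.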
This post assumes that the reader knows how to use the Mac Terminal to navigate between folders and edit specific files.