How to Fix the Japanese Keyword Attack in One Day
Since we here at GoCrayons decided to focus on ElementorResources.com early this year, we totally left this website on its own for months, believing that it would do just fine.
I mean, look at our Google Search Console. We were doing just fine on our target keywords even though we were doing almost nothing.
However, Google Analytics told a different story.
Our case is commonly known as the Japanese Keyword Hack or Japanese Keyword Attack. It’s a pretty common issue across the Internet. And it can easily kill the SEO of a website in mere days, putting months or even years of hard work to waste.
Here’s the SERP for our site, pulled up with Google’s “site:” search.
You can easily mistake this for a “good effect” since you’d see an immediate increase in keyword count and user traffic. But it’s not good. It’s all fake traffic.
ElementorResources.com also fell victim to this, but I managed to salvage it in about a month through consistent research and trial and error.
Here are some images of ElementorResources.com’s web analytics data during the course of the attack:
Our session counts dropped significantly over the course of two weeks. We almost considered just giving up and restarting.
See that “plateau”? Those are thousands of Japanese keywords that ElementorResources.com was ranking for, none of them even 1% related to its content.
And since it’s ranking for Japanese keywords, there was also a significant increase in web traffic from Japan which we didn’t have before at ElementorResources.com.
Anyway, so how did we fix the Japanese keyword hack on ElementorResources.com and GoCrayons.com?
Well, it turned out to be fairly easy and, in hindsight, pretty obvious.
How to Fix the Japanese Keyword Attack Tutorial
When you think about it, the scope of the problem is easy to comprehend. You probably have hundreds or even thousands of pages in Google’s index that you never created, and those pages rank for keywords you don’t want. So, there are two sides to the problem: one on your end and one on Google’s end. In specific terms, the first problem we’ll solve is on the website itself, and the second is in Google’s index.
How you got a ton of Japanese pages ranking on Google almost instantly is that the hacker probably submitted a sitemap of your website using Google Ping. Like this:
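For reference, a sitemap ping is nothing more than a plain GET request to a special Google URL with the sitemap’s address in the query string. Here’s a minimal Python sketch that builds such a URL. The example.com address and the function name are placeholders of mine. Also keep in mind that Google announced in 2023 that it was retiring this ping endpoint, so treat this as an illustration of the mechanism rather than something guaranteed to still work.

```python
from urllib.parse import urlencode

# Google's sitemap ping endpoint (announced as deprecated in 2023).
GOOGLE_PING_ENDPOINT = "https://www.google.com/ping"


def build_ping_url(sitemap_url: str) -> str:
    """Return the URL that asks Google to fetch the given sitemap."""
    # urlencode percent-encodes the sitemap address so it is safe in a query string.
    return GOOGLE_PING_ENDPOINT + "?" + urlencode({"sitemap": sitemap_url})


# Hypothetical sitemap address, for illustration only.
print(build_ping_url("https://example.com/deprecated_sitemap.xml"))
```

Anyone who can place a file on your server can point Google at it this way, which would explain how so many fake pages get indexed so quickly.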
Step 1:
If you’re using WordPress as your Content Management System, you might want to install a security plugin like Wordfence. If not, try other similar plugins. If you’re not using WordPress, I’m pretty sure there are other alternatives.
After installing, perform a scan.
In the case of GoCrayons.com, there were tons of infected files. Some were core WordPress PHP files with malware snippets injected into them, and some were standalone malware scripts hidden inside various plugins, with no discernible pattern.
Our first step is simple: clean those infected core files and delete the malware scripts. If you’re not familiar with the WordPress core code, you might want to assign that task to a more experienced WordPress developer.
Since I’m already partly familiar with this, it took me just 15 minutes to wipe everything clean.
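If you want a rough second opinion next to the plugin scan, a few lines of Python can flag PHP files that contain a typical obfuscation pattern. This is only a heuristic sketch under my own assumptions: the regular expression, the function name, and the idea that the payload is wrapped in eval(base64_decode(...)) or similar are mine, not something Wordfence reported for our sites. It will miss anything written differently, and it can flag legitimate code too.

```python
import re
from pathlib import Path

# Assumed heuristic: obfuscated payloads are often passed through eval()
# together with a decoding function. Not an exhaustive malware signature.
SUSPICIOUS = re.compile(
    r"eval\s*\(\s*(base64_decode|gzinflate|str_rot13)\s*\(", re.IGNORECASE
)


def find_suspicious_php(root):
    """Return (path, line_number) pairs for PHP lines matching the heuristic."""
    hits = []
    for path in sorted(Path(root).rglob("*.php")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if SUSPICIOUS.search(line):
                hits.append((str(path), number))
    return hits
```

Calling find_suspicious_php("public_html") on a local copy of the site returns the files and line numbers worth opening by hand. Review each hit before deleting anything.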
Malware code though looks like this:
Step 2:
Now, our second step is where it gets tricky. The idea itself is obvious: we just reverse what we think the hacker did, which was submitting a fake sitemap. We’re not going to submit our own real sitemap, since Google already has that. Instead, we’ll submit a sitemap that lists all the fake pages and ask Google to recrawl them.
Since you’ve wiped your website clean already on Step 1, these pages should return a 404, and Google will delete them from their index databases. At least that’s how I think it works.
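Before asking Google to recrawl anything, it’s worth confirming that the fake pages really are gone. Here’s a minimal Python sketch of that check. The function names are mine, and treating 410 the same as 404 is my assumption of what counts as “gone”. Network errors such as timeouts are not handled here.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code the server gives for a HEAD request."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is what we want.
        return error.code


def urls_still_live(urls, get_status=fetch_status):
    """Return the URLs that do not answer 404 or 410 yet."""
    return [url for url in urls if get_status(url) not in (404, 410)]
```

If urls_still_live returns anything for your list of fake pages, go back to Step 1 before continuing, because Google would simply re-index those pages.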
So first, we need to get all the pages currently indexed by Google. You have two options for this.
The first option is to use Google Analytics’ Query Explorer, which I might cover in a separate tutorial.
The second option is much more obvious and direct: Google Search Console itself.
Step 3:
Go to your Google Search Console for your account and head to Performance. On the Pages tab, set the Rows Per Page to the maximum, which is 500.
If your website has more than 500 pages, you’d have to do the next step two or more times.
On the top right corner of the table, click the Export icon.
Clicking it should open a panel that lets you download the exported data as a Comma Separated Values file (.csv) or as a Google Sheets file.
For this tutorial, I’ll choose Google Sheets for convenience reasons.
Step 4:
Now that you have your export file, let’s copy all the URLs. If you’re on Google Sheets, click the first cell, and scroll down to the very bottom of that column. While holding SHIFT, click the last cell in the same column to select the whole range, then copy it.
Save it to a TXT file or just anywhere for later use. I’ve pasted mine into Sublime Text 3 and saved it in a DATA format.
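If you picked the CSV download instead of Google Sheets, you can skip the manual copy and paste. Here’s a small Python sketch that pulls the URLs out of the export. I’m assuming the page URLs sit in the first column, and the function name is mine, so check your own file before relying on it.

```python
import csv
import io


def urls_from_export(csv_text):
    """Return first-column values that look like URLs.

    The header row and blank lines are skipped automatically because they
    do not start with http:// or https://.
    """
    rows = csv.reader(io.StringIO(csv_text))
    return [
        row[0].strip()
        for row in rows
        if row and row[0].strip().startswith(("http://", "https://"))
    ]
```

Read the exported file with open(path, encoding="utf-8").read(), pass the text to urls_from_export, and you get a plain list of URLs. If you had to export more than once because of the 500-row limit, run it on each file and join the lists.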
Now, the next step entirely depends on your programming background, but the idea should be simple and the same.
Step 5:
Since I’m a Game Developer as well, I’ll use the open-source game engine called Godot to generate our XML sitemap.
Open Godot and create a new Node.
Create a new script on that new node by clicking the “script” icon beside the node on the inspector.
I tested if the array is valid.
These are the main functions that would allow us to turn the array of URLs into a single text and store it as an XML file.
func write(file_name, content = "", def = mdef):
    var file = File.new()
    file.open(def + file_name + ".xml", file.WRITE)
    file.store_string(content)
    file.close()

func implode(_array, _delimiter = "&&", _exception = ""):
    var array = Array(_array)
    var string = ""
    if array.size() > 0:
        for each in array:
            each = str(each)
            if each != _exception:
                if string == "":
                    string = each
                else:
                    string = string + _delimiter + each
    return string
Now, we need an XML sitemap template for generating the sitemap. I’ll be using Yoast SEO’s template since it looks better, but feel free to use the default for convenience’s sake.
var intro = '<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="//[Your Domain Here]/wp-content/plugins/wordpress-seo/css/main-sitemap.xsl"?>\n<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'

[Your Domain Here] is to be replaced with your website’s domain.

var end = '</urlset>'

func get_parsed_urls(_array):
    var array = []
    for each in _array:
        # One <url> entry per page, all sharing the same lastmod timestamp.
        array.append('<url><loc>' + each + '</loc><lastmod>2019-10-03T14:21:00+00:00</lastmod></url>\n')
    return array
Now that all the preparations are done, we just need to add one last piece of code to the initialization function of GDScript.
var mdef = "res://"

func _ready():
    # Called when the node enters the scene tree for the first time.
    urls = get_parsed_urls(urls)
    var m_string = implode(urls, "")
    var f_string = intro + "\n" + m_string + "" + end
    write("deprecated_sitemap", f_string)
Click the Play or Play Scene tree buttons to generate the sitemap.
Now go to the project’s folder and your sitemap should be there.
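If Godot isn’t your thing, the same idea fits in a few lines of Python. This is a sketch rather than a port of the script above: the function name is mine, it leaves out the Yoast stylesheet line, and it reuses the same fixed lastmod timestamp for every entry.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, lastmod="2019-10-03T14:21:00+00:00"):
    """Return a sitemap XML string with one <url> entry per address."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes &, < and > in text when serializing.
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Write the returned string to a file such as deprecated_sitemap.xml with UTF-8 encoding. One upside of building the XML with a library instead of string concatenation is that special characters in the URLs get escaped for you.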
Step 6 (optional):
Make sure you replace any ampersand (&) character in the URLs with its XML-escaped version (&amp;). A raw ampersand makes the sitemap invalid XML.
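Doing this by hand across thousands of URLs is error-prone, so here’s a small Python sketch that handles it. The function name is mine. It unescapes first so that a URL that was already escaped doesn’t end up double-escaped.

```python
from xml.sax.saxutils import escape, unescape


def escape_for_xml(url):
    """Return the URL with &, < and > escaped exactly once."""
    # unescape() turns an existing &amp; back into &, so running the
    # function twice gives the same result as running it once.
    return escape(unescape(url))


# Hypothetical URL, for illustration only.
print(escape_for_xml("https://example.com/?a=1&b=2"))
```

Run every URL through this before it goes into a loc element if you are building the sitemap by string concatenation.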
Step 7:
Now, upload your sitemap to your website’s home directory, such as public_html.
Step 8:
Validate the custom sitemap first so we won’t run into any issues when we ping Google and submit it through Google Search Console.
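An online validator works fine for this, but you can also run a quick local sanity check. Here’s a minimal Python sketch with a function name of my own. It only checks that the file is well-formed XML, has a urlset root in the sitemap namespace, and stays under the protocol’s 50,000 URL limit per file. It is not a full schema validation.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_URLS = 50000  # per-file limit in the sitemaps.org protocol


def check_sitemap(xml_bytes):
    """Return a list of problems; an empty list means the basic checks passed."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as error:
        return ["not well-formed XML: %s" % error]

    problems = []
    if root.tag != SITEMAP_NS + "urlset":
        problems.append("root element is not <urlset> in the sitemap namespace")
    locs = [loc.text or "" for loc in root.iter(SITEMAP_NS + "loc")]
    if not locs:
        problems.append("no <loc> entries found")
    if len(locs) > MAX_URLS:
        problems.append("more than %d URLs in one file" % MAX_URLS)
    for loc in locs:
        if not loc.startswith(("http://", "https://")):
            problems.append("entry is not an absolute URL: %r" % loc)
    return problems
```

Read the file in binary mode and pass the bytes to check_sitemap. A raw ampersand in any URL shows up here as a “not well-formed XML” problem, which is exactly what Step 6 is meant to prevent.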
Step 9:
Our last step is to submit the custom sitemap to Google.
Ping Google as well, which will (hopefully) make the re-indexing faster.
Validate your “Security Issues” warning on your Google Search Console and tell Google you’ve already removed the infected pages and scripts on your website.
The Results
About 12 hours later, I received an email saying that they had successfully validated the request.
If you have any questions, let me know in the comments section! You might also want to contact us and we can fix your website.