Sunday, October 12th, 2008
This is a follow-up post to my previous posts about my friend’s Google ranking drop. As you may remember, his Google ranking was restored a few weeks after he blocked the proxy website from copying his entire website and submitted a Google reinclusion request. As you may have guessed, he was quite thrilled to see his SERP ranking shoot up again.
Well, as luck would have it, I received a phone call last night from my friend telling me that his website was bombing again. I Googled his favorite keywords and they seemed to rank fine over at my end, but he explained that his traffic stats from Google were flat. They nosedived a day or two ago. I chalked up the results I was getting to Google adjusting the results.
This new twist got me thinking. What in the world could be making this website’s ranking bounce around like this? Looking back, the proxy website may not have been 100% at fault. There has to be something else.
I began doing a little research and learned a few things about duplicate content. The reason I looked at that particular area is that there is absolutely nothing else I can find wrong with this website. Duplicate content seems to be a rather popular culprit.
I came across a pretty well laid out website called “Google Rankings Diagnostics” that describes a whole heck of a lot of issues you might be having with your website. This website validated what I pretty much already knew…that if you have multiple URLs (on a domain) with the same exact content, Google has trouble figuring out which page is the original and may throw all of them out.
I took a very close look at my friend’s website. Again, I took a unique line of text from his homepage and searched for it in Google (inside quotes). A funny thing happened. I saw the homepage result, but there were a few extra results as well, all on his domain. There were about 5 extra pages in total.
Now, some of these extra results have been there for years, so I don’t attribute the issue to those pages being duplicate content. What struck me was one of the extra pages.
A few months ago, my friend moved one of his pages. He put a 301 redirect in his .htaccess file, which was the correct thing to do. So now, the old directory where the page was held forwarded to a new page. It looked something like this:
Redirect 301 /olddirectory/ http://www.hiswebsite.com/newpage.php
The redirect worked fine, but here is what that extra page in the search results looked like:
http://www.hiswebsite.com/newpage.phpoldpage.php
Guess what page was showing at that URL…yup, the homepage. The dynamic nature of his website sends unknown page results like this to the homepage. This was a fluke. My friend forgot that there were pages inside the old directory he redirected to the new page. Every old page in that old directory was tacked on to the new page, like you see above. To make matters worse, there were a bunch of links from other websites pointing to the old pages in the old directory.
I am not sure if this would cause the ranking drops that he is experiencing, but the timing certainly lines up with when the issue began. It is also certainly considered duplicate content.
So, here is what I did to deal with the issue this time. I deleted the redirects in the .htaccess file and blocked the URLs of all those extra results in the robots.txt file. Hopefully, this will tell Google to not spider or index those pages and it will also tell Google that those links into the site are dead.
Now, we have to wait. I am not going to submit another reinclusion request to Google because I want to see if the ranking returns naturally. If it does, this was the problem for sure.
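For anyone who would rather keep a redirect like this than delete it, here is a rough sketch of how it could be written so nothing gets tacked on to the end. Apache's Redirect directive matches on a URL prefix and appends whatever follows that prefix to the target, which is how newpage.phpoldpage.php came about. RedirectMatch takes a regular expression instead and sends the target exactly as written. I have not run this on my friend's site, and the paths are the same placeholders used above.

```apache
# Redirect matches on a path prefix and appends the rest of the request,
# so /olddirectory/oldpage.php became /newpage.phpoldpage.php:
# Redirect 301 /olddirectory/ http://www.hiswebsite.com/newpage.php

# RedirectMatch sends the target exactly as written, so every URL under
# /olddirectory/ lands on newpage.php with nothing appended.
RedirectMatch 301 ^/olddirectory/ http://www.hiswebsite.com/newpage.php
```

With a rule like this, the links from other websites into the old directory would all end up on the new page instead of on copies of the homepage.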
Thursday, September 25th, 2008
This is a follow-up post to my “Sudden Drop In Google Ranking” post.
This morning, I checked the ranking of the website in question. To my surprise, the site had again ranked number 4 in the Google Search Engine Results. This was most definitely good news. In fact, all key phrases now ranked on page one of the Google SERPs.
I can only hope this persists. So, what did we do? Here is a short list:
- Noticed the website had dropped in Google ranking.
- Took a unique phrase from the website homepage and searched Google using quotes, “like this.”
- Found a direct copy of the website and discovered it had been “Proxy Hijacked.”
- Found the IP address of the website that Proxy Hijacked our website and blocked it using the .htaccess file.
- Submitted a “Reconsideration Request” to Google.
After about a week and a half, our website had regained its ranking in Google.
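The IP block in the fourth step might look something like the sketch below. The address 203.0.113.45 is a made-up placeholder from a range reserved for documentation, not the actual proxy's address, and the syntax assumes Apache 2.2-style access control (newer Apache versions use Require directives instead).

```apache
# Let everyone in by default...
Order Allow,Deny
Allow from all
# ...except the proxy that was copying the site (placeholder address).
Deny from 203.0.113.45
```

Requests from the blocked address should then get a 403 Forbidden response instead of the page content.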
I read a long article about Proxy Hijacking and it mentioned that Google had fixed the problem. If Proxy Hijacking is what hit my friend’s website, that claim certainly isn’t true. While I cannot be totally sure Proxy Hijacking caused this case of Google ranking loss, the facts seem to lead down this path.
What is my advice to you? Check either Google or Copyscape once a month to see if someone has taken text or Proxy Hijacked your website.
Saturday, September 20th, 2008
As I wrote in a previous post, duplicate content on your own website can come in the form of “www.mysite.com/” vs. “www.mysite.com/index.html.” The search engines see this same page as two different ones, but with identical content. As I also mentioned, most search engines are smart enough to figure out that these two pages are the same one, but still, they do share Pagerank.
What to do? That’s easy too. Just open up your .htaccess again and type in the following code:
RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://www.mysite.com/ [R=301,L]
You can do this with other pages that have the same problem as well.
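As an example of another page, here is a sketch of the same rule adapted for a homepage served from index.php. The file name and the mysite.com domain are just placeholders, so swap in your own. The condition on THE_REQUEST is there so that the redirect only fires when the visitor actually asked for index.php, and not when Apache serves that file internally for a request to “/”, which would otherwise send the browser in circles.

```apache
RewriteEngine On
# Only act when the visitor's original request line asked for /index.php.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\ HTTP/
# Send a permanent (301) redirect to the bare homepage URL.
RewriteRule ^index\.php$ http://www.mysite.com/ [R=301,L]
```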
Wednesday, September 17th, 2008
Ok, this is a pretty simple thing to do and it has some important benefits.
Have you ever visited a website or a web page only to find that annoying “Not Found” error message? If so, what did you do? You probably got ticked off, hit the back button and visited another website. Can you imagine someone coming across a “Not Found” error page on your website? Well, if you don’t have a custom 404 “File Not Found” page set up on your website, that might just be happening.
Here is what you need to do to fix this problem and keep your visitors on your website.
The first thing is to create a web page with some sort of message on it. Something like, “Whoops, looks like the page you are looking for isn’t here. Please click this link to visit our home page or our search page…” You get the idea. You can save the page as “404.php” or something similar and upload it to the root of your web server.
Oh, I forgot to mention this. In order to do what I am suggesting here, you need to be running an Apache web server and your web host has to allow changes to your .htaccess file. I am sure there are other ways to create a custom 404 File Not Found error page and get it up and running, but I am only talking about one way here.
Now, open up your .htaccess file and place this code into it somewhere. I like to place it right on top:
ErrorDocument 404 /404.php
I am using .php extensions for this stuff just because of habit and preference. You can use .html or whatever you wish.
Well, that’s basically it. You can now save your .htaccess file and upload it to the server and go see if it worked. Try typing in some page that you know isn’t there. If it works, please read my previous post about “How To Check Your Web Page HTTP Headers & Response Codes” for some important information.
Good luck.
Tuesday, September 16th, 2008
There may be cases when you would like to see what your webpage HTTP headers look like. Why? Well, because they are kind of important. As Wikipedia states, the HTTP headers define what the returned data looks like.
Still you ask, “Why in the world do I care about that?” Ok, I’ll keep going. The main reason I look at the HTTP headers is to find out what the HTTP status code is. The reason the status code is important to me is because this is the code the search engines use for a multitude of things.
Let me give you a little example, and this relates to my previous post regarding the sudden drop in Google rankings. As I was doing research into what the problem may be for this particular website, I came across an issue where someone had recently put custom “404 Not Found” error pages up on some of their websites. Everyone knows that custom “404 Not Found” error pages are cool, but what some people don’t know is that if those 404 error pages show a “200 OK” (successful HTTP request) code, the site may be in big trouble, SEO-wise. The reason for this is that there are going to be many “404 Not Found” error pages on a dynamic website. If you have your custom “404 Not Found” error page showing a “200 OK” response code, the search engines will think that all the instances of this page are duplicates. You know as well as I do, that spells trouble.
What’s worse is if you set your homepage as your “404 Not Found” page. Your homepage is going to return a response code of “200 OK.” That’s not good, because now you have multiple instances of your homepage…all duplicate content.
It’s my opinion that the search engines are smart enough to figure this out. The page (such as your homepage) with the highest Pagerank will prevail. Still, I have some websites that I am working on that have multiple instances of the homepage and they all have Pagerank, which isn’t good, because the duplicates are taking the Pagerank from the real page. Now, again, that’s my opinion.
Here are two tips:
- How to check your HTTP headers – visit this website or just Google “Website header check”
- How to set a particular page as your “404 Not Found” error page in your .htaccess file – Just place this code in the file: “ErrorDocument 404 /404.php” without the quotes. The 404.php file is the actual error page in this case.
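One thing to watch for when adding that line, sketched here with the same 404.php file name: according to the Apache documentation, ErrorDocument keeps the 404 status when it points to a local path, but a full URL makes Apache send the visitor a redirect, so the original 404 status never reaches the browser or the search engine. The mysite.com address is just a placeholder.

```apache
# Local path: Apache serves /404.php itself and keeps the 404 status code.
ErrorDocument 404 /404.php

# Full URL (left commented out on purpose): Apache would redirect the
# visitor instead, and the 404 status code would be lost.
# ErrorDocument 404 http://www.mysite.com/404.php
```

Either way, it is worth running the error page through a header check afterward to confirm it really returns “404 Not Found.”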
Monday, September 15th, 2008
A colleague of mine gave me a call yesterday morning with some rather upsetting news. Apparently, one of his websites took a plunge in its Google Ranking. He wanted to know what could cause such a sudden drop in Google Ranking like this.
I really didn’t have an answer for him. The site has been alive (but in the Google Sandbox) for about four years. It always struck me as strange that the site was sandboxed for such a long time. It literally took four years to come from page 30 in the Google Rankings to page one. Suddenly, last month, the website appeared on page one for its most prime keywords. Now, this wasn’t a gradual change in ranking, it was a huge jump.
The website doesn’t appear to have anything wrong with it. I gave the entire site a once over. I checked the typical meta information and linking structure and found nothing wrong. The website really hasn’t changed in months, besides the content, so it led me to believe there are outside forces at work.
The question I have is, “Why would a website, with a poor ranking, suddenly rank number five on Google one month and then fall back to page 24 the next month?”
I tried to get some information out of my friend. The only major thing he did in the past few weeks was add a custom 404 or Not Found error message. I checked the 404 page to make sure the headers were correct and not giving 200 results. The 404 error pages were fine.
Then, I went over to Copyscape to see if there were any copies of his homepage. I have heard this can cause a sudden drop in Google rankings. I did find a proxy website that had almost his entire website cached and was trying to pull it off as its own. This wasn’t a typical proxy server trying to speed up the internet. This was something else…more like an intercept proxy.
I looked in the log files to find the IP address of this proxy website. I found it and blocked the IP address in his .htaccess file and then checked the proxy website again. His website no longer showed and was replaced by the Red Hat error page instead.
We will have to give this a few weeks to see if anything changes. I am now thinking that if something does change (for the better), this may have been what was causing the extremely long Google Sandbox issue as well.
If you have any further suggestions, please let me know via comment.
Monday, September 15th, 2008
Today has been an interesting day. We have been taking a look at our websites and searching for duplicate content using Copyscape. After today’s findings, we might just go with Copyscape’s premium service.
Now, let me just tell you that duplicate content is everywhere. Actually, someone has probably written this sentence a million times. What we were searching for today was blatant and far reaching content theft. We found a few instances of one of our homepages and general website idea taken for someone else’s use as well as many instances of interior pages taken. Needless to say, we made screen copies of these cases and sent them to our attorney’s office. These are serious and can’t be ignored.
I would like to talk about two things you can do to deal with a more subtle form of duplicate content on your own website.
The first form of duplicate content on your own website is in the form of www vs. non-www. If you go to your website and type in “www.mysite.com” and then type in “mysite.com,” you may see the same page appear. In the search engine’s eyes, these are two copies of the same page. How do you fix this? It’s easy. Just open up your .htaccess file and type in the following code:
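RewriteEngine On
# If the requested host is anything other than www.mysite.com (ignoring case)...
RewriteCond %{HTTP_HOST} !^www\.mysite\.com [NC]
# ...send a permanent (301) redirect to the same path on www.mysite.com.
RewriteRule ^(.*)$ http://www.mysite.com/$1 [R=permanent,L]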
When someone types in “mysite.com” to visit your website, they will automatically be forwarded to “www.mysite.com.” The search engines will be forwarded as well.
Another form of duplicate content on your own website comes in the form of “www.mysite.com/” vs. “www.mysite.com/index.html.” The search engines see this same page as two different ones. What to do? That’s easy too. Just open up your .htaccess again and type in the following code:
RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://www.mysite.com/ [R=301,L]
When someone either types in “www.mysite.com/index.html” or follows a link like that to your website, they will automatically be forwarded to “www.mysite.com.”
Now, here is the disclaimer. I used this on my server setup and it worked. Please check with your own hosting company to see if something similar will work for you too.
Saturday, January 12th, 2008
Again, I read the writing on the wall.
One of the companies that created some of my website applications recently put a web poll in their forum. They asked how we users feel about having the next upgrade (and every one thereafter) only work with PHP5 and up on our web servers. I didn’t know what to think. I believe I was running PHP 4.3.8 on a few of the servers and some other version of 4 on the other servers. I didn’t know how all of my 30+ websites would respond to a PHP upgrade.
To find out, I did a little configuring in the .htaccess files on two of my websites. By doing this, I automatically brought the version up to PHP 5.2.4. If things blew up, I could easily bring the version back down to PHP4 by taking that line of code out of the .htaccess file.
Well, I tried it out and everything worked fine, except for two little files that an outsourced developer created for me. I contacted him and he fixed the issue rather quickly. Since I knew things would be ok, I went ahead and had one of my other servers upgraded to PHP 5.2.5.1. After the upgrade, everything worked fine on all my websites.
So, what is the difference between PHP4 and PHP5? You can click here to find out.