Content Matters: A Web 2.0 Perspective
September 19, 2008. Posted by simarprit in Content, Web 2.0, Web 2.0 Expo.
This Web 2.0 Expo session on Content Matters is being coordinated by Liz Danzico.
Some facts as a preamble to this discussion:
Content Drives Traffic
Users don’t read online
The content now comes in various unexpected ways
We aren’t writing, we are speaking in text; the internet looks like writing, but it is actually a conversation
Types/classification of content:
- Navigation and orientation content
- Labels and actions
- Non-textual content
You need a content strategy, and you need a content strategist to look after it. If you have a lot of information flowing in through the UGC route, you would be better off dedicating an “Information Architect” to integrate it. Style sheets and content guidelines are two of the essential documents you must have operational in any content development project. To guard against plagiarism claims, keep the background documentation and sources for your creations handy.
For a successful content strategy you need the team to bring in Passion, Editorial Responsibility and Monitoring Responsibility. Sites which have a major flow of user generated content need to continuously evaluate content on multiple parameters.
A well-attended but poorly presented session, with too few takeaways from a very illustrious panel.
SEO Best Practices: Content Issues
August 24, 2008. Posted by simarprit in Blogging, Content, Internet, Search Engines, SEO, SES 2008, Spamming, websites.
Content Duplication Issues and SEO Best Practices
Continuing my series on SES 2008 San Jose, this white paper is again a hybrid of what was shared there and what I have learnt over a decade of working with search engines.
If I give you 10 pages to read, you would scan through them, start reading, and if what you are reading is new to you, you might read all ten in one go.
Now, if I give you 10 pages and your scan tells you that you’ve read them before, or that only one page is unique, you may not read even my one unique page; you would trash them all. Worse, you would remember me as the guy who tricked you into reading 10 pages when he had material for only one. You would make a note: “not so nice a man to know.”
To me this is content duplication, and so it is to search engines. So here we go:
- A search engine’s job is to satisfy the searcher; they want to grow and be seen as credible.
- Search engines have no favorites.
- They trust you unless you betray them; they work on the basic premise that what you are feeding them is your own and unique.
- So when you feed search engines anything, they “scan” it; if you are “new”, they may read the whole of it.
- If you are not “new”, they’ll trash you and “remember” you as “not a good site to know”.
So what are your choices? The simple choice is always to provide new content, but this choice is expensive and restrictive for many. So what do those many do:
- Put the same content on many pages of the same site, as is.
- Put the same content on many pages of the same site with minor modifications, disguising it as new content.
- Put the same content on many different sites under the same ownership.
- Put the same content on many different sites with minor modifications, the sites being under the same ownership.
- Put the same content on many different sites under different ownerships, on different servers, in different data centers, with or without minor modifications.
They all presume they can manipulate their way around it. Some do succeed, but the issue is how hard you are working to do something that is wrong anyway. Search engines are becoming smarter with every passing day: they are scanning better, storing better, and recalling better. The best course is not to duplicate your own content, and not to take others’ content and put it on your site. Remember, sooner or later you will be caught and become “Not a Good Site to Know”, and search engines will drop you, as we all do.
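As a rough illustration only (real search-engine duplicate detection is far more sophisticated and undocumented), comparing overlapping word shingles with a Jaccard score shows how “minor modifications” still register as near-duplicates:

```python
def shingles(text, k=3):
    """Split text into a set of overlapping k-word shingles."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

# Hypothetical page snippets for the demonstration
original = "search engines reward unique content that satisfies the searcher"
copy = "search engines reward unique content that satisfies the searcher every time"
fresh = "a style guide keeps editorial voice consistent across a large team"

print(jaccard(original, copy))   # high score: flagged as a near-duplicate
print(jaccard(original, fresh))  # low score: treated as distinct content
```

Tacking a few words onto the end of a copied page barely moves the score, which is exactly why disguised duplication tends not to work.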
This leaves us with the issue of what happens when someone does this to you. Yes, this is the issue!
If you are the original source of the content, your worry is: how does the search engine know I am the original? Search engines are working very hard to identify the original source; in case they don’t, make them aware.
Do what you would do with any of your assets: protect them, be vigilant, and act if someone breaches your copyright. A related issue arises when you syndicate your original content; I will cover that subsequently.
Some common inadvertent content duplication mistakes and issues:
- When spiders read your content four times: at http://example.com, http://www.example.com, http://example.com/index.html and http://www.example.com/index.html. Most spiders know how to handle this, but it helps to put 301 redirects in place and route everything to www.example.com
- When you change platforms
- When you change URL structures: remove the old URLs and deploy 301 redirects
- When you create test folders: remove your test folders
- When you shift to a subdomain: clear the old content permanently from your servers
- Disclaimers, privacy policies, and copyright statements running across sites: serve them through non-crawlable JS functions or link them to a single central copy
- Check your landing pages; if you have multiple landing pages, make each one unique
- Check your meta titles and meta descriptions; they need to be unique
- Be careful with mirrored sites
- Content in multiple languages that shares common attributes or language strings is a no-no
- Use the exclusion protocol in robots.txt wherever you need to share the same content, within the same site or across different domains
- Check for and remove any hidden links
- Use password protection where you need to carry duplicate content
- Permanent deletion of duplicate content is better than redirection
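The first item in the list, the four-way www/non-www and /index.html duplication, can be fixed at the server. A minimal sketch for an Apache server with mod_rewrite enabled (example.com is a placeholder, and the exact syntax varies by host):

```apache
RewriteEngine On
# Send the bare domain to the www host with a permanent (301) redirect
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
# Collapse /index.html onto the directory root
RewriteRule ^index\.html$ http://www.example.com/ [R=301,L]
```

With these rules in an .htaccess file, all four URL variants resolve to a single canonical address, so spiders see one page instead of four.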
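As for the robots.txt exclusion protocol, a file at the site root along these lines keeps compliant crawlers out of folders that carry duplicate content (the paths are illustrative):

```
User-agent: *
# Printable versions duplicate the main pages
Disallow: /print/
# Test folders should not be indexed at all
Disallow: /test/
```

Note that robots.txt only asks crawlers to stay out; for content that must never surface, password protection or permanent deletion is the safer route.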
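Checking that meta titles and descriptions are unique is easy to automate: group pages by title and flag any group larger than one. A minimal sketch in Python; the page data here is hypothetical, and in practice you would fetch the titles with a crawler:

```python
from collections import defaultdict

def find_duplicate_titles(pages):
    """Map each meta title to the URLs sharing it; return only the offenders."""
    by_title = defaultdict(list)
    for url, title in pages.items():
        by_title[title.strip().lower()].append(url)
    return {t: urls for t, urls in by_title.items() if len(urls) > 1}

# Hypothetical crawl results: URL -> meta title
pages = {
    "/": "Acme Widgets - Home",
    "/about": "Acme Widgets - About Us",
    "/products/a": "Acme Widgets",
    "/products/b": "Acme Widgets",
}

for title, urls in find_duplicate_titles(pages).items():
    print(title, "->", urls)
```

The same grouping works for meta descriptions; run it after every batch of new pages goes live.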
The above can form some of the best practices SEOs can follow.
more to come…