After working through this guide, you should understand how to create and modify news article pages using design and layout choices that improve Google's ability to crawl and comprehend their content.
2.1.1 What Is Design and Layout?
The design and layout of your site determine how it appears to the end user. This matters because Google is ultimately driven by a user-first philosophy: web pages that satisfy the user's need first, fastest and in the simplest possible way are rewarded with higher SERP rankings.
Your site's design and layout also determine how easily web crawlers, such as Googlebot, can crawl and index it. A simple, optimized design and layout means fast and easy crawling, which in turn translates into better rankings.
So what stops publishers from implementing design and layout best practices? Most often, they are held back by a handful of common pain points, which this guide addresses in turn.
To see how design and layout play out in practice, we performed a simple Google search, entering the keywords “Charlie Puth News” into the search bar.
Here’s what the search results turned up:
At number two on the Top Stories SERP ranking, right above a story by NME, is the Daily Illini’s article on Charlie Puth’s latest release.
The fact that the Daily Illini, a university student newspaper, outranks the world's biggest standalone music website raises an important question: how is a student paper from a midwestern American town of some 40,000 inhabitants outranking the largest music news website on earth? Intrigued, we decided to dig around a little more.
First we checked out NME’s page on Charlie Puth.
Right off the bat, the first thing we notice is a pop-up video struggling to load in the bottom right-hand corner of the screen. The buffering video also hides part of the news headline and its body.
Next up, we notice that the initial viewport is occupied mostly by stuff that isn’t relevant to the news story. There’s a big banner ad covering about half of the page, and of course there’s the video.
In fact, scrolling down the page, we encounter more videos, more big, rich images, more pop-up ads and a lot of hyperlinks. Given how media-rich the page is, it unsurprisingly takes quite a while to load.
We next inspected the Daily Illini and here’s what we found.
The page is neat, clean and uncluttered. It has its share of ads and a big Donate button at the top, but there are no videos or pop-ups covering the viewport or obstructing the news headline. We can see the headline right away, and it is very likely that the same applies to Google’s web crawler.
On the whole, the page is light, minimalistic and lightning fast to load.
We decided to peek under the hood a little more at the underlying code. By right-clicking on the page and selecting View Source (while using Chrome), we can see the page’s code.
This is what we saw for the NME page:
Two things grabbed our attention here:
This is not the best thing for a page for two reasons:
When we looked at the code for the Daily Illini page on the other hand, we saw this:
This is very simple HTML code. Also, there are no scripts running within the <head> section.
How does this all add up to the Daily Illini outranking NME?
There are probably a number of factors at work here, and one among them is design and layout. The Daily Illini page deploys certain design and layout techniques that even small publishers can easily replicate to boost their overall SEO strategy.
These include using clean, simple HTML code, avoiding scripts in the header section, keeping the page light and fast to load, and not relying too heavily on pop-ups and interstitial ads.
The guide below digs into each one of these in detail, while explaining several other techniques you can implement to significantly improve your SERP rankings.
Semantics relates to the meaning of words. Semantic HTML tags are those whose names clearly convey their meaning to both the reader and a web crawler.
For example, when we use a tag like <header>, we know at a glance what it contains: the introductory content at the top of the page.
Similarly <h1> is a semantic tag that tells Googlebot that what follows is the most important heading in the article.
By contrast, when we use a tag like <div>, its meaning is not immediately apparent. In HTML <div> stands for division, and all it implies is that a new code section has begun, without necessarily revealing any information about the contents of this section.
Web crawlers like Googlebot are built on machine learning systems that attempt to approximate how the human brain processes language. This means they make sense of text in much the same way a human reader does.
HTML code that is easy for humans to understand should also be easy for Google’s web crawler to understand.
As an example, consider the two pieces of HTML code below:
Source: https://www.pluralsight.com/guides/semantic-html
This page uses the <div> tag for everything, from the header to the main content to the footer. Reading the tags alone, it is not apparent what any section contains.
By contrast, the page below uses semantic markup. The header is placed within the <header> tag, the footer within the <footer> tag, and the main body of the article goes within the <main> tag.
Source: https://www.pluralsight.com/guides/semantic-html
Since this is easy for Googlebot to read and understand, this page has a better chance of ranking higher than the previous one, all other things being equal.
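The contrast between the two pages can be sketched in a few lines of HTML. The page content below is invented for illustration; only the choice of tags matters:

```html
<!-- Non-semantic layout: every section is an anonymous <div> -->
<div id="top">My News Site</div>
<div id="story">
  <div class="big">Headline goes here</div>
  <div class="text">Story body goes here...</div>
</div>
<div id="bottom">Copyright 2024</div>

<!-- Semantic layout: the tags themselves describe each section -->
<header>My News Site</header>
<main>
  <article>
    <h1>Headline goes here</h1>
    <p>Story body goes here...</p>
  </article>
</main>
<footer>Copyright 2024</footer>
```

Both versions render similar content, but only the second tells a crawler which part is the site header, which is the article, and which is the footer.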
To check whether your page uses semantic markup in Google Chrome, simply right-click on the page and click Inspect to view the page's HTML source code. Common semantic elements include <article>, <video>, <form>, <header>, <footer> and <nav>.
We now know what semantic markup is and why it's important. But how do we use it to improve SEO?
It’s simple — always use semantic markup to mark out important information about your article’s design and layout. This includes the following article information:
Ensure that your page’s layout is well-ordered to improve crawling
You’re designing your site to be read by both humans and web crawlers and, as such, your design and layout should reflect this fact.
Below are a few tips to help you achieve measurable outcomes for your website.
You can use HTML, CSS, JavaScript or any other frontend language to create rich and interactive pages. However, remember that the more advanced the language, the greater its complexity, and the greater the chances that a web crawler will find it hard to read, interpret and render.
A page coded in plain HTML may not be the prettiest to look at, but it will both load faster and be better optimized for search engines, for the simple reason that search engines can read and understand it faster.
Think of plain HTML as the bare bones skeleton of your web page. You can add CSS and Javascript to flesh it out and make it look aesthetically pleasing and dynamic, but it would be better to keep the most important content within the skeleton rather than place it in the flesh.
So how do we implement plain HTML? One simple way of doing it is to place the main body of your content within <article> HTML tags.
This way, when web crawlers encounter the <article> tag, they know immediately that what follows is the most important content on your page — the news article. This helps the search engine understand that the content wrapped within this tag needs to be assigned greater weight.
Plain HTML’s <article> tag is a semantic marker that looks like this:
Source: https://en.wikipedia.org/wiki/Article_element
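A minimal news page built around the <article> tag might look like the sketch below. The headline, byline and body text are invented for illustration:

```html
<article>
  <h1>Local Band Releases New Single</h1>
  <p>By Jane Reporter, 9 October 2023</p>
  <p>The main body of the news story goes here, in plain HTML
     paragraphs that any crawler can read without executing scripts.</p>
</article>
```

When Googlebot hits the opening <article> tag, it knows the enclosed content is the self-contained story on this page.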
The next obvious question? If I’m using a CMS like WordPress, where do I insert these tags?
How to Do This: If you’re building a custom website using HTML, then you can check the source code to ensure it’s using plain HTML, especially in critical areas. We’d advise speaking in more detail with a developer to ensure you don’t hamstring functionality by accident.
If you’re using WordPress, then refer to this guide. You may also find this guide on how to insert HTML into posts and pages a useful reference source.
These instructions are for WordPress as WordPress remains the most popular CMS for publishers. If you’re using a different CMS such as Wix, please consult the support or documentation page for your CMS.
If you have access to a team of web developers, it is best to have them do it as editing HTML code can be time consuming.
Test to ensure that your content appears correctly in all browsers, devices and sizes. This one is more obvious but often overlooked. If your content does not appear the way you want it across all browsers and devices, it will affect user experience, and in the long run, your SERP rankings.
How to Do This: To test content across platforms, open your page on different devices and in different browsers to see how it is rendered.
At a minimum, you should test for the following:
HTML markup helps highlight the different elements of your page. Structured data helps search engines read what's inside those elements and better understand their content.
Structured data is simply a series of instructions written in a simple language, such as JSON-LD, that can be inserted within the existing HTML code of your webpage. Think of it like a meta description, but for individual pieces of content on your page.
In the example below, structured data helps Google identify five attributes of a DBpedia page about John Lennon:
Source: https://json-ld.org/
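The example, as published on json-ld.org, is a short JSON-LD document: a context, an identifier, and three facts about the person.

```json
{
  "@context": "https://json-ld.org/contexts/person.jsonld",
  "@id": "http://dbpedia.org/resource/John_Lennon",
  "name": "John Lennon",
  "born": "1940-10-09",
  "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
}
```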
As you can see, the code is in simple language that is easy for both a human reader and a web crawler to understand.
Here’s another example that shows how structured data can fit right into your web page’s existing HTML code. The structured data instructions are highlighted in green.
Source: developers.google.com
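A sketch of that example, reconstructed from Google's structured-data documentation (the exact fields shown in the screenshot may differ slightly), embeds the JSON-LD inside a script tag in the page's <head>:

```html
<html>
  <head>
    <title>Party Coffee Cake</title>
    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Recipe",
      "name": "Party Coffee Cake",
      "author": { "@type": "Person", "name": "Mary Stone" },
      "datePublished": "2018-03-10",
      "description": "This coffee cake is awesome and perfect for parties.",
      "prepTime": "PT20M"
    }
    </script>
  </head>
  <body>
    <h2>Party coffee cake recipe</h2>
    <p>This coffee cake is awesome and perfect for parties.</p>
  </body>
</html>
```

The visible page stays untouched; the structured data rides along invisibly for crawlers to read.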
In this example, structured data tells Googlebot that this is a recipe page about a coffee cake from somebody called Mary Stone.
Using structured data in your website’s layout delivers measurable outcomes. For instance, using structured data can increase a website’s click-through rate (CTR) by up to 30%.
Using structured data also helps your page rank better on Google’s carousels, featured snippets, videos and knowledge panel features.
For Google News SEO, it’s important to include the following elements when creating structured data to provide additional value:
How to Do This: You can add structured data/schema to your content either manually or by using a plugin for your particular CMS.
All the elements of your news article should be arranged in a specific order to allow faster and easier crawling. The order is as follows:
Page experience is a measure of how users experience your page. Google has a set of parameters to quantify page experience. We’ve dedicated an entire module to page experience factors, so we’ll only briefly look at each here.
How to Do This: You can test page experience both manually and by using plugins or third-party apps. For instance, PageSpeed Insights is a handy tool that analyzes your site's performance based on Core Web Vitals (CWV) and other parameters and assigns a score based on its analysis. It also tests both mobile and desktop responsiveness.
News publishers should not publish multiple news articles under the same URL, as this obstructs Google from indexing them. Each news article should have its own unique URL.
Furthermore, these URLs should be permanent: the same news story should stay at the same URL. If the story associated with a particular URL changes frequently, Google will not be able to crawl and index it reliably. Publishers should, however, update the news story itself as often as needed.
If redirects need to be used for news articles they should be implemented according to the following best practices:
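As a rule of thumb, a permanent move is served as a 301 redirect, while a temporary one is served as a 302 so that search engines keep the original URL indexed. In nginx, for example, that might look like the sketch below (the paths are hypothetical):

```nginx
# Article permanently moved to a new URL: 301 tells Google to index the new address
location = /2023/old-headline {
    return 301 /2023/updated-headline;
}

# Article temporarily unavailable: 302 tells Google to keep the original URL indexed
location = /live-coverage {
    return 302 /live-coverage-holding-page;
}
```

Other servers and CMSs expose the same 301/302 distinction through their own redirect settings.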
While the action items listed in this section aren’t as important as those above, we still recommend implementing as many of these as possible once the mission critical points listed above have been addressed.
The <head> element of a page contains important information about the page that is not actually displayed on it: metadata that helps Googlebot identify and classify the page's contents.
As a rule, the <head> element should include only the most important tags and nothing else, so a post can be crawled and rendered properly. These include:
Anything else contained within the <head> element is likely to confuse web crawlers.
For instance, it is common for novices to confuse the title tag with <h1> and place the latter within the <head> element. As previously explained, the <head> element can only contain metadata that is not displayed on the page.
Even though title and <h1> should contain essentially the same information, the former is metadata meant for web crawlers and to be displayed within the SERP results and browser tab, while the latter is information to be displayed on the page.
The code below shows how to place title within the <head> element.
Source: developer.mozilla.org
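The distinction can be sketched in a minimal page (the headline and site name are invented for illustration): the title lives in the <head> as metadata, while the <h1> lives in the <body> as visible content.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <!-- Metadata only: shown in the browser tab and SERP, not on the page -->
    <title>Local Band Releases New Single | My News Site</title>
    <meta name="description" content="A short summary shown in search results.">
  </head>
  <body>
    <!-- On-page heading: displayed to the reader -->
    <h1>Local Band Releases New Single</h1>
  </body>
</html>
```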
Using page elements that make it easy to scan content and make navigation a frictionless experience for the user also impacts SEO.
An easy to navigate page will contain these elements:
Unless you are a seasoned web developer, it is best to consult with an expert on the best way to implement a user-friendly UX.
Google wants publishers to display ads without disrupting the user’s experience. For this reason it may penalize websites that display too many intrusive ads. While user experience is a subjective metric, Google has certain guidelines and best practices when it comes to ads.
Some of them relate to:
For more on ads and popups, refer to our detailed module.
JavaScript is great for creating dynamic and interactive content, but web crawlers may have difficulty rendering it.
This is because:
With news articles, it’s good practice to avoid interruptions such as related article carousels or image galleries.
Many successful publishers become concerned when relaunching or redesigning their site, as it requires Google to recrawl it. Follow these best practices to ensure a smooth transition back to normal after a redesign or relaunch:
Keep your article pages as light as possible. We've already looked at avoiding JavaScript in articles, but it's also good practice to avoid heavy HTML content.
This is because when Googlebot crawls your page, it downloads a maximum of 15 MB of page data in the first crawl. For most pages this is not a major issue, as heavyweight items such as videos and images are referenced separately within the code and indexed by Googlebot later, so they fall outside this 15 MB limit.
However, this does once again point to the fact that the lighter your page, the easier it will be for Googlebot to crawl and index it.
Tip: If you want to check the size of your page, open your browser's developer tools, switch to the Network tab, then reload the page. This displays all the requests your browser made to fully render the page. The first request on the list shows the size of your page's HTML under the Size column. For most pages on the internet, this figure is in kilobytes.
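If you prefer the command line, the same check can be sketched with curl and wc. The URL in the comment is a placeholder; to keep the commands self-contained, the demonstration below measures a small locally created file instead of a live page:

```shell
# To measure a live page you would download it first, e.g.:
#   curl -s https://example.com/article -o page.html   # placeholder URL
# Here we create a tiny local file so the commands run anywhere:
printf '<!doctype html><title>Test</title><p>Hello' > page.html

# Raw HTML bytes downloaded; compare against Googlebot's 15 MB first-crawl limit
bytes=$(wc -c < page.html)
echo "$bytes bytes"
```

Note that this measures only the raw HTML, not the images, scripts and stylesheets the page references.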
Article snippets give readers a preview of the content on the page before they click on it. Google determines the snippet to go with each article by crawling the text in the main body of the article just below the title.
To avoid the possibility of Googlebot including incorrect snippets make sure that:
Sometimes, Googlebot may either fail to index your image or index a different image than the one you intended to feature with your article. To avoid this, follow these best practices:
Googlebot uses your article’s title to correctly identify and index it. Use these following best practices to ensure that it reads your title accurately:
Let’s look at two case studies of sites that have already implemented the steps discussed in this article.
Modern news websites are complex and rich, and it would be unrealistic to expect them to adhere to these guidelines rigorously. However, in this section we’re trying to demonstrate how following the guidelines can result in predictable, measurable outcomes.
The Manly Observer is a hyperlocal news website catering to audiences in a popular beach-side suburb of Sydney, Australia. Below is what a typical news article on the site looks like:
We see the following elements of design clear and present at first glance:
Looking next at the page’s HTML code, we can see the use of semantic markup.
This is code that’s easily readable by a human. It is safe to presume that a web crawler will be able to read and interpret this code with equal ease.
The website uses the https:// scheme and has no pop-up ads or interstitials loading within the initial viewport.
Entrepreneur is a popular magazine for entrepreneurs and businesses. This is how its homepage appears.
The website is lightning fast to load and there are no pop-up ads or videos on the homepage itself. Most of the ad placement occurs on individual news articles.
When we click to “view source”, we see the following HTML code:
At a glance, we can make out the following from this code:
As we scroll down, we see the following code element:
We discussed the use of schema.org and Open Graph (OG) markup for images earlier. To recap, schema.org and OG are types of structured data that help web crawlers identify specific elements of the code more easily. Both are implemented here.
Further down, we also see structured data tags as shown below:
As with our previous example, entrepreneur.com also uses the https:// scheme in its link, has no disruptive interstitials or pop-ups, and is fast to load. The news articles follow a set format of title, image, author attribution, date and main body of content. This results in a better page experience and hence improved technical SEO.
After working through this lesson, you should be able to review and update existing news pages, optimizing their design and layout to adhere to technical SEO best practices.