Announcing the Just-Discovered Links Report
Posted by The_Tela
Hey everyone, I'm Tela. I head up data planning at SEOmoz, working on our indexes, our Mozscape API, and other really fun technical and data-focused products. This is actually my first post on the blog, and I get to announce a brand new feature – fun!
One of the challenges inbound marketers face is knowing when a new link has surfaced. Today, we're thrilled to announce a new feature in Open Site Explorer that helps you discover new links within an hour of them going up on the web: the Just-Discovered Links report.
This report helps you capitalize on links while they're still fresh, see how your content is resonating through social channels, gauge overall sentiment of the links being shared, get a head start on instant outreach campaigns, and scope out which links your competitors are getting. Just-Discovered Links is in beta, and you can find it in Open Site Explorer as a new tab on the right. Ready to learn more? Let's go!
What is the Just-Discovered Links report?
This report is driven by a new SEOmoz index that is independent from the Mozscape index, and is populated with URLs that are shared on Twitter. This means that if you would like to have a URL included in the index, just tweet it through any Twitter account.
One note: the crawlers respect robots.txt and politeness rules, so URLs blocked by those rules will not be indexed. We also won't index URLs that return a 500 status code.
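As a rough illustration of the note above (not SEOmoz's actual crawler code), the check can be sketched in Python with the standard library's robots.txt parser. The function name `is_indexable` and the user agent string `example-bot` are assumptions made up for this example, and politeness rules such as crawl delays are not modeled.

```python
from urllib.robotparser import RobotFileParser


def is_indexable(url, robots_txt, status_code, user_agent="example-bot"):
    """Return True if a robots.txt-respecting crawler could index this URL.

    `user_agent` is a hypothetical name, not the real crawler's user agent.
    """
    parser = RobotFileParser()
    # Parse the robots.txt body directly instead of fetching it over the network.
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch(user_agent, url):
        return False
    # The post states that URLs returning a 500 status code are not indexed.
    return status_code != 500


robots = "User-agent: *\nDisallow: /private/\n"
print(is_indexable("http://example.com/post", robots, 200))
print(is_indexable("http://example.com/private/page", robots, 200))
```

The first call reports the URL as indexable, while the second is blocked by the `Disallow` rule.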
Who is it for?
Our toolsets and data sources are expanding to support a wider set of inbound marketing activities, but we designed Just-Discovered Links with link builders in mind.
You can search Just-Discovered Links through the main search box on Open Site Explorer. Enter a domain, subdomain, or specific URL just as you would when using the Inbound Links report. Then select the Just-Discovered Links beta tab. The report gives PRO members up to 10,000 links with anchor text and the destination URL, as well as Domain Authority and Page Authority metrics.
One important note on Page Authority: we will generally not have a Page Authority score available for new URLs, and will show [No data] in this case. So, when you see [No data], it generally indicates a link on a new page.
You can also filter the results using many of the same filter drop-downs you are used to using in other reports in Open Site Explorer. These include followed and no-followed links, and 301s; as well as internal or external links, and links to specific pages or subdomains. Note: We recommend you start searches using the default "pages on this root domain" query, and refine your search from there.
How does it work?
When a link is tweeted, we crawl that URL within minutes. We also crawl all of the links on the tweeted page. These URLs, their anchor text, and their metadata (such as nofollow, redirect, and more) are stored and indexed. It may take up to an hour for links to be retrieved, crawled, and indexed.
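To make the extraction step concrete, here is a minimal Python sketch that pulls each link's URL, anchor text, and nofollow flag out of a page's HTML. It is an illustration under assumptions, not the production pipeline: the class `LinkExtractor`, the function `extract_links`, and the record field names are invented for this example, and redirect handling is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect URL, anchor text, and nofollow flag for each <a href> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._current = None  # link record being built while inside an <a> tag

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        rel_values = (attrs.get("rel") or "").lower().split()
        self._current = {
            "url": urljoin(self.base_url, href),  # resolve relative links
            "anchor_text": "",
            "nofollow": "nofollow" in rel_values,
        }

    def handle_data(self, data):
        if self._current is not None:
            self._current["anchor_text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Collapse runs of whitespace in the anchor text.
            text = " ".join(self._current["anchor_text"].split())
            self._current["anchor_text"] = text
            self.links.append(self._current)
            self._current = None


def extract_links(page_url, html):
    parser = LinkExtractor(page_url)
    parser.feed(html)
    parser.close()
    return parser.links


sample = '<p><a href="/a" rel="nofollow">First link</a></p>'
print(extract_links("http://example.com/page", sample))
```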
We were able to build this feature rapidly by reusing much of the technology stack from Fresh Web Explorer. The indexes and implementation are a little different, but the underlying technology is the same. Dan Lecocq, the lead engineer on both projects, recently wrote an excellent post explaining the crawling and indexing infrastructure we use for Fresh Web Explorer.
There are a few notable differences: we don’t use a crawl scheduler because we just index tweeted URLs as they come in. That’s how we are able to include URLs quickly. Also, unlike Fresh Web Explorer, the Just-Discovered Links report is focused exclusively on anchor text and URLs, so we don’t do any de-chroming as that would mean excluding some links that could be valuable.
How is it different?
Freshness of data continues to be a top priority when we design new products. We have traditionally released indexes on the timeframe of weeks. With this report, we have a new link index that is updated in about an hour. From weeks to an hour – wow! We'll be providing additional details in the future on what this means.
This index includes valuable links that may be high-quality and topically relevant to your site or specific URL but are new, and thus have a low Page Authority score. This means they may not be included in the Mozscape index until they have been established and earned their own links. With this new index, we expect to uncover high-quality links significantly faster than they would appear in Mozscape.
I want to clarify that we are not injecting URLs from the Just-Discovered Links report into our Mozscape index. We will be able to do this in the future, but we want to gather customer feedback and understand usage before connecting these two indexes. So for now, the indexes are completely separate.
How big is the index?
We have seeded the index and are adding new URLs as they are shared, but don’t yet have a full 30 days’ worth of data in the index. We are projecting that the index will include between 250 million and 300 million URLs when full. We keep adding data, and will be at full capacity in the next week.
How long will URLs stay in the index?
We are keeping URLs in the index for 30 days. After that, URLs will fall out of the index and not appear in the Just-Discovered Links report. However, you can tweet the URL and it will be included again.
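The 30-day window can be pictured with a small Python sketch. This is only a model of the behavior described above, assuming that re-tweeting a URL simply resets its clock. The `LinkIndex` class and its method names are hypothetical, and the real index's storage is not described in this post.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # retention window stated in the post


class LinkIndex:
    """Toy in-memory model of a rolling 30-day URL index."""

    def __init__(self):
        self._last_shared = {}

    def record_share(self, url, shared_at):
        # A new tweet of the same URL refreshes its timestamp.
        self._last_shared[url] = shared_at

    def contains(self, url, now):
        shared_at = self._last_shared.get(url)
        return shared_at is not None and now - shared_at <= RETENTION


index = LinkIndex()
start = datetime(2013, 4, 1, tzinfo=timezone.utc)
index.record_share("http://example.com/post", start)
print(index.contains("http://example.com/post", start + timedelta(days=10)))
print(index.contains("http://example.com/post", start + timedelta(days=45)))
```

The URL is present on day 10 and gone by day 45, unless it has been shared again in the meantime.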
How long does it take to index a URL?
We are able to crawl and include URLs in the live index within an hour of being shared on Twitter. You may see URLs appear in the report more quickly, but generally you can expect it to take about an hour.
Why did you choose Twitter as a data source?
About 10% of tweets include URLs, and many Twitter users share links as a primary activity. However, we would like to include other data sources that are of value. I’d love to hear from folks in the comments below on data sources they would like to see us consider for inclusion in this report.
How much data can I get?
The Just-Discovered Links report has the same usage limits as the Inbound Links report in Open Site Explorer. PRO customers can retrieve 10,000 results per day, community members can get 20 results, and guests can see the first five results.
What is “UTC” in the Date Crawled column?
We report time in UTC, or Coordinated Universal Time. This time standard will be familiar to our European customers, but might not be as familiar to customers in the States. UTC is ahead of US time zones (five hours ahead of Eastern Standard Time), so US customers will sometimes see links whose timestamp appears to be in the future, but this is really just a time zone difference. We can discover links quickly, but can’t predict links before they happen. Yet, anyways.
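If you want to translate a Date Crawled value into local time, a short Python sketch like the one below does the job. The timestamp format `YYYY-MM-DD HH:MM` is an assumption for illustration (check the format the report actually shows), and the fixed five-hour offset covers Eastern Standard Time only, not daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset for Eastern Standard Time (UTC-5); daylight saving is ignored here.
EST = timezone(timedelta(hours=-5), "EST")


def utc_to_est(timestamp):
    """Convert a 'YYYY-MM-DD HH:MM' UTC timestamp (assumed format) to EST."""
    utc_time = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
    utc_time = utc_time.replace(tzinfo=timezone.utc)
    return utc_time.astimezone(EST)


local = utc_to_est("2013-04-02 01:30")
print(local.strftime("%Y-%m-%d %H:%M %Z"))  # prints "2013-04-01 20:30 EST"
```

A link crawled at 01:30 UTC on April 2 was crawled at 8:30 p.m. Eastern on April 1, which is why a UTC timestamp can look like it is in the future.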
You can export a CSV with the results from your Just-Discovered Links report search. The CSV export will be limited to 5,000 links for now. We plan to increase this to 10,000 rows of data in the near future. We need to re-tool some of Open Site Explorer’s data storage infrastructure before we can offer larger exports, and don’t have an exact ETA for this addition quite yet.
This is a beta release
We wanted to roll this out quickly so we can gather feedback from our customers on how they use this data, and on overall features. We have a survey where you can make suggestions for improving the feature and leave feedback. However, please keep in mind that this is a beta when deciding how to use this data as part of your workflow. We may make changes to the report based on the feedback we receive.
Top four ways to use Just-Discovered Links
Quick outreach is critical for link building. The Just-Discovered Links report helps you find link opportunities within a short time of being shared, increasing the likelihood that you’ll be able to earn short-term link-building wins and build a relationship with long-term value. Here are four ways to use the recency of these links to help your SEO efforts:
- Link building: Download the CSV and sort based on anchor text to focus on keywords you are interested in. Are there any no-followed links you could get switched to followed? Sort by Domain Authority for new links to prioritize your efforts.
- Competitor research: See links to your competitor as they stream in. Filter out internal links to understand their link building strategy. See where they are getting followed links and no-followed links. You can also identify low-quality link sources that you may want to avoid. Filter by internal links for your competitors to identify issues with their information architecture. Are lots of their shared links 301s? Are they no-following internal links on a regular basis?
- Your broken links: The CSV export shows the HTTP status code for links. Use this to find 404 links to your site and reach out to get the links changed to a working URL.
- Competitor broken links: Find broken links going to your competitors’ sites. Reach out and have them link to your site instead.
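The CSV-based workflows above can be scripted. Here is a minimal Python sketch, assuming the export has columns named "URL", "Anchor Text", "Domain Authority", "Followed", and "HTTP Status Code". Those header names and the "yes"/"no" values are guesses for illustration, so adjust them to match the headers in your actual export.

```python
import csv
import io


def load_rows(csv_text):
    """Parse exported CSV text into a list of dicts keyed by column header."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def prioritize(rows):
    """Sort links by Domain Authority, highest first."""
    return sorted(rows, key=lambda r: float(r["Domain Authority"] or 0), reverse=True)


def nofollowed(rows):
    """Links you might ask to have switched to followed."""
    return [r for r in rows if r["Followed"].strip().lower() == "no"]


def broken(rows):
    """Links pointing at pages that return a 404."""
    return [r for r in rows if r["HTTP Status Code"].strip() == "404"]


# Made-up sample data using the assumed column names.
sample_csv = """URL,Anchor Text,Domain Authority,Followed,HTTP Status Code
http://a.example/post,great tool,45,yes,200
http://b.example/list,seo software,72,no,200
http://c.example/old,broken page,60,yes,404
"""

rows = load_rows(sample_csv)
print([r["URL"] for r in prioritize(rows)])
print([r["URL"] for r in nofollowed(rows)])
print([r["URL"] for r in broken(rows)])
```

Pointing the same functions at a competitor's export gives you their broken links to pursue.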
Ready to find some links?
We’ve been releasing new versions of our Mozscape index about every two weeks. An index that is continuously updated within an hour is new for us, too, and we’re still learning how this can make a positive impact on your workflow. Just as with the release of Fresh Web Explorer, we would love to get feedback from you on how you use this report, as well as any issues that you uncover so we can address them quickly.
The report is live and ready to use now. Head on over to Open Site Explorer’s new Just-Discovered Links tab and get started!