JS Tips Series - innerText.match()

📅

Published Nov 16, 2020 by

Intro

This quick article shows how easily you can scan the text of a web page and count how many times a given word appears in it.

💡

I was recently watching Kyle Robinson Young's YouTube video on How To Make Chrome Extensions and found this useful gem.

Code + Explanation

Here's the full code:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <meta http-equiv="X-UA-Compatible" content="ie=edge" />
    <title>Static Template</title>
    <script>
      document.addEventListener("DOMContentLoaded", function () {
        const re = new RegExp("ideas", "gi");
        const matched_word_count = document.documentElement.innerText.match(re);
        console.log(matched_word_count.length); // 2
      });
    </script>
  </head>
  <body>
    <h1>What am I up to?</h1>
    <p>
      I have some very ambitious product ideas. User experience and making a
      meaningful difference in our users' lives are at the core of all these
      product ideas. I can't wait to see them come to life!
    </p>

    <p>
      I started my journey in product development as a designer initially. Then my interest split between user experience and web development. What I love about programming is the vastness of this ocean. There is so much to explore,
      discover, learn, and contribute!
    </p>
  </body>
</html>

Explanation:

Our main interest is line 11. We introduce a constant, matched_word_count, that stores the array returned by innerText.match(re). Here, re is the regex holding the pattern we want to match, and document.documentElement.innerText is the rendered text of the web page as a single string.

// line 11
const matched_word_count = document.documentElement.innerText.match(re);

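The array returned in matched_word_count contains every substring that matched the regex defined on line 10. If nothing matches, match() returns null instead of an empty array, so reading .length would throw in that case.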

// line 10
const re = new RegExp("ideas", "gi");

About the regex:

  • It's essentially looking for the string "ideas"
  • The gi flags combine g (global: find all occurrences instead of stopping at the first) and i (case-insensitive matching) - source: w3schools.com
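To see what each flag contributes, here is a small sketch that runs the same pattern with no flags, with g only, and with gi. The sample string is made up for illustration and stands in for document.documentElement.innerText.

```javascript
// Made-up sample text; on a real page this would be document.documentElement.innerText.
const text = "Ideas are cheap. Good ideas are rare.";

// No flags: match() stops at the first case-sensitive match
// and returns it along with details such as its index.
const first = text.match(new RegExp("ideas"));

// "g" only: every case-sensitive match, as a plain array of strings.
const allCaseSensitive = text.match(new RegExp("ideas", "g"));

// "gi": every match, ignoring case.
const allIgnoreCase = text.match(new RegExp("ideas", "gi"));

console.log(first[0], first.index);   // ideas 22
console.log(allCaseSensitive.length); // 1
console.log(allIgnoreCase.length);    // 2
```

Without i, the capitalised "Ideas" at the start of the sentence is missed, which is why the article's snippet uses both flags.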

We then read the array's length on line 12 to count the number of matches found.

console.log(matched_word_count.length); // 2
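One thing to watch for: match() returns null, not an empty array, when the pattern is not found, so reading .length directly would throw. A minimal sketch of a null-safe counter follows. The countMatches helper and the sample string are hypothetical names introduced here, not part of the original snippet.

```javascript
// Hypothetical helper: counts case-insensitive occurrences of `pattern` in `text`.
function countMatches(text, pattern) {
  const re = new RegExp(pattern, "gi");
  // match() returns null when nothing matches,
  // so fall back to an empty array before reading .length.
  const matches = text.match(re) || [];
  return matches.length;
}

// Made-up sample text standing in for the page's innerText.
const sample = "I have some product ideas. These product ideas are at the core.";
console.log(countMatches(sample, "ideas"));   // 2
console.log(countMatches(sample, "unicorn")); // 0
```

In the browser you could call it as countMatches(document.documentElement.innerText, "ideas").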

I'll be covering more about the new RegExp in a future post.

⚠️

Here's a newbie mistake I made when running this code: I didn't wrap it in a DOMContentLoaded event listener. The script sits in the <head>, so it ran before the <body> had been parsed and no match was returned. Make sure to wrap your code with document.addEventListener("DOMContentLoaded", function () { ... }); to avoid this mistake.
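If the script might also be loaded after the page has already been parsed (for example, injected later), the DOMContentLoaded event will have fired already and the listener would never run. One possible sketch that handles both cases checks document.readyState first. The onReady helper is a hypothetical name, and it takes the document as a parameter only so it can be exercised outside a browser.

```javascript
// Hypothetical helper: run `callback` once the DOM has been parsed.
function onReady(doc, callback) {
  if (doc.readyState === "loading") {
    // Still parsing: wait for the event, as in the article's snippet.
    doc.addEventListener("DOMContentLoaded", callback);
  } else {
    // Already "interactive" or "complete": the event has fired, so run now.
    callback();
  }
}

// In a browser, pass the real document. The guard lets this file load elsewhere too.
if (typeof document !== "undefined") {
  onReady(document, function () {
    const matches = document.documentElement.innerText.match(/ideas/gi) || [];
    console.log(matches.length);
  });
}
```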

Some Use Case Ideas:

  • You need to count the number of words in a given web page
  • Simple search to check if a particular word appears on the page
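Both ideas can be sketched in a few lines. The helper names (escapeRegExp, countWords, containsWord) and the sample text are hypothetical and used here only for illustration. Note that the article's pattern "ideas" matches substrings, so this sketch adds \b word boundaries for a whole-word check. \b is based on ASCII word characters, so it may not behave as expected for words with accented letters.

```javascript
// Escape regex metacharacters so the search word is matched literally.
function escapeRegExp(word) {
  return word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Total number of whitespace-separated words in the text.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// True if `word` appears as a whole word, ignoring case.
function containsWord(text, word) {
  const re = new RegExp("\\b" + escapeRegExp(word) + "\\b", "i");
  return re.test(text);
}

// Made-up sample text standing in for the page's innerText.
const pageText = "I have some very ambitious product ideas.";
console.log(countWords(pageText));            // 7
console.log(containsWord(pageText, "ideas")); // true
console.log(containsWord(pageText, "idea"));  // false
```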

Up next:

  • I'll do a write-up exploring new RegExp()