
Processing a large dataset in less than 100 lines of Node.js with async.queue

If you’re more of a skip-to-the-code person, check out the gist here.

caolan’s async.queue to the rescue


To fix the call stack issue, I needed to manage my API calls by pushing them into a queue where they could be processed in parallel.

Pushing items to the queue

My image IDs are in a newline-delimited JSON file. First, I convert this file into a JSON object using readFileSync. The object contains a list of image IDs, and in my queue I want to send each image to the Vision API. The queue takes a task (in this case my object of image IDs) and a callback function, which is called when the worker is finished processing. When push receives an array, each element becomes its own task and the callback runs once per finished task:

q.push(imageIds, function (err) {
  // Runs once for each task as its worker finishes
  if (err) {
    console.log(err);
  }
});
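For the file-loading step, here is a minimal sketch of how a newline-delimited JSON file might be turned into an array of tasks. The helper names parseNdjson and loadImageIds are hypothetical (they are not from the gist), and the sketch assumes one JSON value per line:

```javascript
const fs = require('fs');

// Parse newline-delimited JSON: one JSON value per non-empty line.
// (Hypothetical helper, not taken from the gist.)
function parseNdjson(text) {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

// Read the whole file synchronously, then parse it into an array of tasks.
// The caller supplies the path to the file.
function loadImageIds(filePath) {
  return parseNdjson(fs.readFileSync(filePath, 'utf8'));
}
```

The array returned by loadImageIds can be handed straight to q.push. Reading synchronously is reasonable here because nothing else can start until the IDs are in memory.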


Defining the queue

The queue takes a function and a concurrency number as parameters. Let’s start with the function: we pass it a task (our image ID from above) and a callback, which will be called when the worker completes a task. Inside the function is where I do my image processing.
This function should return some JSON about the image, which I want to write to a local JSON file. I’ll define that in the next step.

Concurrency tells the queue the maximum number of workers that can process tasks at the same time. I played with the number until I found a balance that wasn’t too slow but also didn’t run into API limits or call stack errors. The number will vary depending on what you’re doing, so it’s definitely OK to fine-tune it by hand until you find your “magic number.” Here’s my queue:

let q = async.queue(callVision, 20);
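To make the concurrency number more concrete, here is a simplified stand-in for a concurrency-limited queue. It is not how the async library is implemented; createQueue and everything inside it are made up for illustration, and the sketch only shows the idea of capping how many tasks are in flight:

```javascript
// Simplified stand-in for a concurrency-limited queue.
// NOT the async library's code; names here are hypothetical.
function createQueue(worker, concurrency) {
  const pending = []; // tasks waiting for a free worker slot
  let running = 0;    // tasks currently being processed

  function next() {
    // Start tasks until every slot is busy or nothing is waiting.
    while (running < concurrency && pending.length > 0) {
      const { task, done } = pending.shift();
      running++;
      worker(task, (err, result) => {
        running--;
        if (done) done(err, result);
        next(); // a slot just freed up, so pull in the next task
      });
    }
  }

  return {
    // Accept a single task or an array of tasks.
    push(tasks, done) {
      for (const task of [].concat(tasks)) pending.push({ task, done });
      next();
    },
    running: () => running,        // tasks in flight
    length: () => pending.length,  // tasks still waiting
  };
}
```

With a concurrency of 2 and three tasks pushed, two workers start right away and the third task waits until one of them calls its callback. Raising the number means more API calls in flight at once, which is where the trade-off between speed and API limits comes from.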

Processing images


Last, it’s time to write the callVision() function referenced above. This part isn’t exactly async.queue specific, but it’s still important because it’s the meat of my queue task.
Here I’m using Google’s Cloud Vision API for image analysis, and I use the Google Cloud Node.js module to call it. Once I get a JSON response for each image, I create a JSON string of the response to write to a newline-delimited JSON file (I’m using this format because it’s what BigQuery expects, which is where I’ll be storing the data eventually). Once this function completes, the data is sent back to the queue, where it is written to a local JSON file. You can find all of the callVision() code in the gist.
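The real callVision() lives in the gist, so what follows is only a sketch of the shape a worker like it might take. The Vision API request is replaced by a hypothetical fakeAnnotate function, and makeCallVision and toNdjsonLine are invented names; none of this is the Google Cloud module's API:

```javascript
// Hypothetical stand-in for the Vision API call (the real request is in the gist).
// Takes an image ID and a Node-style callback, and returns made-up data.
function fakeAnnotate(imageId, callback) {
  callback(null, { image_id: imageId, labels: ['example'] });
}

// Turn one API response into a single newline-delimited JSON record.
function toNdjsonLine(response) {
  return JSON.stringify(response) + '\n';
}

// Build a queue worker around any annotate function with the same shape.
function makeCallVision(annotate) {
  return function callVision(task, callback) {
    annotate(task, (err, response) => {
      // Pass failures back to the queue instead of throwing.
      if (err) return callback(err);
      // Hand the serialized record back to the queue's per-task callback.
      callback(null, toNdjsonLine(response));
    });
  };
}
```

Because each record already ends in a newline, the callback that receives it can append the string to the output file as-is, and the result stays valid newline-delimited JSON.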

That’s it! You’ve done something interesting with async.queue.



*Sara Robinson





