Python and gevent

Let’s talk about event loops

The easiest way to make your code run faster is to do less. At some point, though, you don’t want to do less. Maybe you want to do more, without it being any slower. Maybe you want to make what you have fast, without cutting out any of the work. What then? In this enlightened age, the answer is easy — parallelize it! Threads are always a good choice, but without careful consideration, it’s easy to create all manner of strange race conditions or out-of-order data access. So today, let’s talk about a different route: event loops.

Event whats?
If you’re not familiar with evented code, it’s a way to run work concurrently, similar to threading or multiprocessing. Unlike threads, though, evented code is typically cooperative: each execution path must voluntarily give up control. Each of these execution units actually runs in serial and, when finished, returns control to the main loop. The gain comes from dividing the work so that when a unit makes a blocking call (e.g., a DB call, HTTP request, or disk access), it gives up control, letting the main event loop run other functions while it waits for the call to return. This is perfect for cases where you do a lot of I/O and relatively little work in the evented thread itself.
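To make the cooperative part concrete, here is a toy event loop built from plain Python generators. It is only a sketch of the idea, not how gevent works internally, and the names `task` and `run_loop` are made up for illustration. Each task does a little work and then yields, and the loop simply rotates through whatever is still runnable.

```python
from collections import deque


def task(name, steps, log):
    """A cooperative task: record one step of work, then yield control."""
    for i in range(steps):
        log.append((name, i))
        yield  # voluntarily hand control back to the loop


def run_loop(tasks):
    """Round-robin over the tasks until every one of them has finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)  # run the task until its next yield
        except StopIteration:
            continue  # task is done, so drop it from the rotation
        ready.append(current)


log = []
run_loop([task("a", 2, log), task("b", 2, log)])
print(log)  # [('a', 0), ('b', 0), ('a', 1), ('b', 1)]
```

Nothing here runs at the same time. The interleaving only happens because each task gives up control at its `yield`.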

In my case, I’m interested in doing a number of heterogeneous, but related, data lookups in response to a web request. We’re running all of this behind our data access layer, which is an evented Thrift server. I have a number of functions (with a common API) I’m interested in running, and a naive implementation would look like this:

Python-Gevent-1
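As a rough sketch of what a serial version along these lines might look like: the lookup functions, their names, and the `time.sleep` calls standing in for blocking I/O are all hypothetical, not the production code.

```python
import time


def lookup_users(request):
    time.sleep(0.01)  # stand-in for a blocking DB call
    return {"users": request["id"]}


def lookup_orders(request):
    time.sleep(0.01)  # stand-in for a blocking HTTP request
    return {"orders": request["id"] * 2}


def lookup_prefs(request):
    time.sleep(0.01)  # stand-in for a disk read
    return {"prefs": request["id"] + 1}


# Every lookup shares the same API: it takes the request and returns a dict.
LOOKUPS = [lookup_users, lookup_orders, lookup_prefs]


def handle_request(request):
    """Call each lookup one after another and merge the results."""
    results = {}
    for fn in LOOKUPS:
        results.update(fn(request))
    return results
```

The total time is the sum of the individual lookups, since each one has to finish before the next begins.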

What does that do?

If we run this on a machine with TraceView installed, we’ll see the following request structure:

Pretty predictable. We called into each function serially, which is exactly what we said we’d do. We can also look at the raw events TraceView collected, and they tell a similar story:

All together now!

This seems like a good baseline, but let’s see what happens when we parallelize it. Let’s use Python’s gevent, which has two major selling points. First, it provides a ready-made event loop (built on libevent in early releases and on libev since gevent 1.0), which means we won’t have to implement one ourselves. We can just spawn separate greenlets (lightweight, non-native tasks), and gevent will handle all the scheduling. The other big advantage is that gevent knows about and can monkeypatch existing synchronous libraries to cede control when they block. This means that outside of the actual event spawning, we can leave our existing code untouched, and our external calls will just do the right thing.

That just leaves the question of how to break up the work into parallel coroutines. It seems natural to give each of these functions its own, and then have our “main thread” wait for them all to finish. We do this by firing off each function in a separate task and collecting the tasks in a list. We then wait for all of those tasks to finish and collect the results. Easy! Here’s what the same function as above looks like, but evented:

Python-Gevent-2
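A minimal sketch of the evented shape, assuming gevent is installed. The lookups are hypothetical, and `gevent.sleep` stands in for a blocking call that yields to the loop. In a real service you would more likely call `gevent.monkey.patch_all()` at startup so that ordinary socket and DB calls yield on their own.

```python
import gevent


def lookup_users(request):
    gevent.sleep(0.01)  # stand-in for a blocking call that yields to the loop
    return {"users": request["id"]}


def lookup_orders(request):
    gevent.sleep(0.01)
    return {"orders": request["id"] * 2}


LOOKUPS = [lookup_users, lookup_orders]


def handle_request(request):
    """Spawn one greenlet per lookup, wait for all of them, merge the results."""
    tasks = [gevent.spawn(fn, request) for fn in LOOKUPS]
    gevent.joinall(tasks)  # block until every greenlet has finished
    results = {}
    for t in tasks:
        results.update(t.value)  # .value holds the function's return value
    return results
```

Only `handle_request` changed; the lookups keep the same signature they had in the serial version.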

This calling change is all* that’s necessary to parallelize these functions! The next question is, did that help? Let’s look at the same graphs we had before, but now for the evented case:

Definitely different! Instead of running everything sequentially, we can see all seven functions running at the same time. As we’d hoped, this has a major impact on our total response time, as well. It’s 500ms faster — a speedup of 2x!

*Caveats
OK, so it’s not as perfectly simple as this example. There are a few “gotchas” that are worth bearing in mind when you start to use this in a real example.

The first one is that gevent mimics separate threads for each coroutine. This means that if you’re storing global state that’s thread-aware, gevent may discard it. Notably, Pylons/Pyramid uses thread-safe object proxies to store global request state, which means that new coroutines will hide that information from you. In our production version of this code, we explicitly pass that state from caller to callee, then set it in the global “pylons.request” object before running the function. It lets us seamlessly mix evented and non-evented functions, while only thinking about the details of gevent in one place.
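Here is a small sketch of that pattern, assuming gevent is installed. A `gevent.local.local` object plays the part of the thread-aware proxy; `request_state`, `read_state`, and `with_state` are hypothetical names, not Pylons APIs.

```python
import gevent
from gevent.local import local

# Hypothetical stand-in for a thread-aware proxy such as pylons.request.
request_state = local()


def read_state():
    """Return the current user, or None if this greenlet never had one set."""
    return getattr(request_state, "user", None)


def with_state(user, fn):
    """Restore the caller's state inside the new greenlet, then run fn."""
    request_state.user = user
    return fn()


request_state.user = "alice"  # set in the calling greenlet

bare = gevent.spawn(read_state)  # the state is not visible in here
wrapped = gevent.spawn(with_state, request_state.user, read_state)
gevent.joinall([bare, wrapped])

print(bare.value, wrapped.value)  # None alice
```

Keeping the hand-off inside one wrapper like `with_state` means the lookup functions themselves never need to know they are running in a greenlet.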

The second big gotcha is error handling. Since these aren’t normal function calls, exceptions don’t propagate to the caller. They must be explicitly checked for on the task and re-thrown, if appropriate. This sort of error-case-checking is familiar to any C programmer, but it’s different than the normal Python idiom, so it’s worth thinking about.
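One way to do that check, sketched with hypothetical lookup functions and assuming gevent is installed: after `joinall`, each greenlet’s `exception` attribute is either `None` or the exception that killed it.

```python
import gevent


def ok_lookup():
    return "ok"


def failing_lookup():
    raise ValueError("lookup failed")


def run_all(functions):
    """Run every function in its own greenlet and re-raise the first failure."""
    tasks = [gevent.spawn(fn) for fn in functions]
    gevent.joinall(tasks)  # by default this does not raise on task errors
    results = []
    for t in tasks:
        if t.exception is not None:
            raise t.exception  # surface the error in the caller
        results.append(t.value)
    return results
```

gevent also prints the traceback of a failed greenlet to stderr, but without a check like this the caller would just see a `value` of `None` for that task.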

Another caveat is that spawning multiple events doesn’t actually get you code-level parallelization. It runs blocking calls in parallel, but you still only get one interpreter thread to run your Python (no magic GIL sidestep here!). If you’re looking to speed up heavy computations or other CPU-intensive work, check out the multiprocessing module. Eventing really shines when the majority of your work is database calls, file access, or other blocking, out-of-process work.

Finally, if you’re looking to trace these kinds of calls with TraceView (like we did here), it’s pretty straightforward. The only thing to remember is to wrap your evented function calls using “oboe.log_method”, and pass “entry_kvs={‘Async’: True}”. This ensures that we calculate the timing information properly for all your parallel work.

But that’s it! You can use this technique to speed up existing projects, or build something entirely new with gevent at the core. What are you planning on doing with it?

And, of course, if you want to instrument your evented project now, sign up for a 14-day trial of TraceView today!

More Stories By TR Jordan

A veteran of MIT’s Lincoln Labs, TR is a reformed physicist and full-stack hacker – for some limited definition of full stack. After a few years as Software Development Lead with Thermopylae Science and Technology, he left to join Tracelytics as its first engineer. Following Tracelytics’ merger with AppNeta, TR was tapped to run all of its developer and market evangelism efforts. TR still harbors a not-so-secret love for Matlab-esque graphs and half-baked statistics, as well as elegant and highly performant code. Read more of his articles at www.appneta.com/blog or visit www.appneta.com.