At Scale By the Bay, we love learning about the stories behind the most beloved technology companies (take this story of a retired programmer who charts his cognitive decline using Grammarly). Today, we are talking to Umayah Abdennabi, a Software Engineer on the Data Team at Grammarly who works on the company’s internal data analytics platform.
At Scale By the Bay, Umayah will take a deep dive into iterators and the ways to use them in a codebase. In advance of his talk, Umayah shares his story, including his interest in Scala, his work at Grammarly, and the key trends that are shaping the future of Scala and Big Data.
How did you get interested in Big Data and Scala and what was the turning point when you joined Grammarly?
I have been interested in Big Data and Functional Programming since learning about them in college. Before joining Grammarly, I was learning about Spark, which introduced me to Scala. What made Scala appealing to me is that it's object-oriented and functional, runs on the JVM, and has an active community. Scala was also gaining a lot of traction as more companies adopted Spark, Kafka, and other projects built with it. I started looking for a career that would allow me to use my skills on a larger scale. Grammarly attracted me with its strong engineering culture and the technology I wanted to learn about.
What's your current role, and what exciting things are you working on at the moment?
Currently, I am a software engineer on the Data Team, where I work on an internal data analytics platform with a team of five engineers and three data scientists. Three exciting things we are working on are building a custom experiment framework, optimizing our pipeline to handle higher rates of more granular data, and decreasing the lag before an event becomes queryable.
What's the biggest challenge that you face in your work and how are you addressing the challenge?
We face different types of challenges on our team. One recent challenge we faced was reducing the lag from when an event hits our endpoint to when it’s indexed and queryable by our users. Some things we did to address it included changing how we store and retrieve data between each pipeline job and reducing dependencies between jobs, allowing them to run concurrently.
What's the biggest thing that is misunderstood about Big Data and Scala?
Maintaining a scalable, precise, and robust big data system requires a great deal of work and effort. Any big data platform depends on proper processes and tooling, well-tested and clean code, and a large amount of detailed knowledge about how everything works.
Given those demands, it is no accident that Scala, with its powerful abstractions for reasoning about your code, has become a language of choice for big data systems.
What are the three trends that will shape the future of the space?
Laws and regulations regarding how data is used and stored; stronger guarantees across heterogeneous subsystems that scale; and compiler and virtual machine optimizations.
What will you talk about at Scale By the Bay and why did you choose to cover this subject?
I will be talking about iterators. Iterators are constructs that we deal with frequently, whether we know it or not. I wanted to look deeply into a simple concept and understand it at a more fundamental level. Doing so allows one to write clean, high-quality, and scalable code.
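To make the idea concrete, here is a minimal Scala sketch (the object name `IteratorDemo` and its helper are my own, for illustration only) of two properties that make iterators worth understanding at a fundamental level: they are lazy, so transformations compose over even an infinite sequence without building intermediate collections, and they are single-use, exhausted by traversal.

```scala
object IteratorDemo {
  // Lazily draw the first `count` even squares from an infinite
  // stream of naturals; map/filter do no work until `take`/`toList`.
  def firstEvenSquares(count: Int): List[Int] =
    Iterator.from(1).map(n => n * n).filter(_ % 2 == 0).take(count).toList

  def main(args: Array[String]): Unit = {
    println(firstEvenSquares(3)) // List(4, 16, 36)

    // Unlike a collection, an iterator is consumed by traversal:
    val it = Iterator(1, 2, 3)
    println(it.sum)     // 6
    println(it.hasNext) // false -- exhausted after the sum
  }
}
```

The single-use behavior in particular is a common source of subtle bugs when an iterator is accidentally traversed twice, which is one reason the concept rewards a closer look.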
Who should attend your talk and what will they learn?
Anyone who wants a light and easy talk. Hopefully, they will have a better understanding of iterators and start looking into other concepts more deeply.