We are open sourcing Feathr – the feature store we built to simplify machine learning (ML) feature management and improve developer productivity. At LinkedIn, dozens of applications use Feathr to define features, compute them for training, deploy them in production, and share the … | Continue reading
At LinkedIn, we are committed to delivering a best-in-class platform experience for our members. One of the technologies that we use to do that is Java, an object-oriented programming language that produces software for multiple platforms. We are a huge consumer of Java … | Continue reading
Co-authors: Jaewon Yang, Minji Yoon, Sufeng Niu, Dash Shi, and Qi He | Continue reading
Co-authors: Dhruv Bansal, Aanchal Somani, Sneha Dewan, and Vikrant Mahajan | Continue reading
Co-authors: Steven Chuang, Qinyu Yue, Aravind Rao, and Srihari Duddukuru | Continue reading
Co-authors: Jaewon Yang, Jiatong Chen, and Yanen Li | Continue reading
Co-authors: Shivani Pai Kasturi and Swati Gambhir | Continue reading
Co-author: Cuong Tran. Long-tail latencies affect members every day, and improving the response times of systems even at the 99th percentile is critical to the member's experience. There can be many causes, such as slow applications, slow disk accesses, errors in the network, and man … | Continue reading
Co-authors: Keqiu Hu, Jonathan Hung, Haibo Chen, and Sriram Rao | Continue reading
Co-authors: Reza Hosseini, Albert Chen, Kaixu Yang, Sayan Patra, Rachit Arora, and Parvez Ahammad | Continue reading
While site outages are inevitable, it’s our job to minimize both the duration of outages and the likelihood of an outage occurring. One of our preemptive measures is the way we determine overall site capacity and health on an everyday basis: we load-test in production. … | Continue reading
Pinot is an open source, scalable, distributed OLAP data store that recently entered the Apache Incubator. Developed at LinkedIn, it works across a wide variety of production use cases to deliver real-time, low-latency analytics. | Continue reading
Co-authors: Chris Li, Kevin Lau, and Subbu Sanka | Continue reading
Co-authors: Akbar KM and Kalyanasundaram Somasundaram | Continue reading
Sometimes, an engineering problem arises that might make us feel like maybe we don't know what we're doing, or at the very least, forces us out of the comfort zone of our area of expertise. That day came for the Venice team at LinkedIn when we began to notice that some Venice pro … | Continue reading
Co-authors: Alexander Ivaniuk and Weitao Duan | Continue reading
Co-authors: Walaa Eldin Moustafa, Wenye Zhang, Sushant Raikar, Raymond Lam, Ron Hu, Shardul Mahadik, Laura Chen, Khai Tran, Chris Chen, and Nagarathnam Muthusamy | Continue reading
Co-authors: Xiang Zhang and Jingyu Zhu | Continue reading
When I started my journey at LinkedIn ten years ago, the company was just beginning to experience extreme growth in the volume, variety, and velocity of our data. Over the next few years, my colleagues and I in LinkedIn’s data infrastructure team built out foundational technology … | Continue reading
Pegasus Data Schema (PDSC) is a Pegasus schema definition language that has been used for data modeling with Rest.li services for years. It's the underlying language that helps define data models, describe the data returned by REST endpoints, and generate derivative schemas for o … | Continue reading
In recent years, we’ve been fortunate to see a growing number of excellent machine learning tools, such as TensorFlow, PyTorch, DeepLearning4J, and CNTK for neural networks, Spark and Kubeflow for very-large-scale pipelines, and scikit-learn, ML.NET, and the recent Tribuo for a w … | Continue reading
Co-authors: Nima Dini and Dan Sully | Continue reading
Co-authors: Jitesh Gandhi and Eric Babyak | Continue reading
As companies grow, adapt, morph, and mature, one item remains the same: the need for reinvention. Technical infrastructure is no exception. As our member community grew, our priorities were to keep up with that growth, or as we say, ensure continuous “site up.” (Read: adding serv … | Continue reading
Our logo is inspired by the chameleon: You can enable personalization on your ranking model with GDMix, bringing a personalized experience to every user, like a chameleon that can match its surroundings. | Continue reading
Co-authors: Scott Meyer, Andrew Carter, and Andrew Rodriguez | Continue reading
The internet software industry has moved away from long development cycles and dedicated quality assurance (QA) stages, toward a fast-paced continuous-integration/continuous-delivery (CI/CD) pipeline, where new code is quickly written, committed, and pushed to user-facing applica … | Continue reading
Co-authors: Sandhya Ramu and Vasanth Rajamani | Continue reading
Co-authors: Weiwei Guo, Xiaowei Liu, Sida Wang, Huiji Gao, and Bo Long | Continue reading
Co-authors: Pradhan Cadabam and Jingxuan (Rex) Zhang | Continue reading
Co-authors: Tyler Grant, Armen Hamstra, and Cliff Snyder | Continue reading
Built at LinkedIn, Pinot is an open source, distributed, and scalable OLAP data store that we use as our de facto near-real-time analytics service. We’ve previously discussed how and why we built Pinot to power a wide spectrum of use cases, including internal business intelligenc … | Continue reading
Co-authors: Mars Lan, Seyi Adebajo, and Shirshanka Das | Continue reading
Co-authors: Kerem Sahin, Mars Lan, and Shirshanka Das. Finding the right data quickly is critical for any company that relies on big data insights to make data-driven decisions. Not only does this impact the productivity of data users (including analysts, machine learning develope … | Continue reading
Nearly 20 years after the first release of Python 2 and 11 years after the first release of Python 3, the Python development community has retired Python 2.7, the last of the Python 2 series. This marks the end of all upstream support for Python 2, including bug and security fixe … | Continue reading
Co-authors: Alexander Ivaniuk and Jingbang Liu | Continue reading
If a picture’s worth a thousand words, then what about a video? | Continue reading
Co-author: Tim Crofts | Continue reading
As an engineer, your goal is for every commit to seamlessly land in production and provide a delightful experience for your customers. While frequent releases give you the ability to iterate and apply feedback quickly, they also require significant time, effort, and cost to achie … | Continue reading
Co-authors: Jon Lee and Wesley Wu | Continue reading
Editor's Note: This is the second in a series of posts describing how we improved productivity at scale—both in terms of lines of code and number of engineers—at LinkedIn. In our first post of the #ProductivityAtScale series, we shared details on how we improved build time by 400 … | Continue reading
Co-authors: Christian Mathiesen and Jie Zhang | Continue reading
LinkedIn started in 2003 with the goal of connecting you to your network for better job opportunities. It had only 2,700 members the first week. Fast forward many years, and LinkedIn’s product portfolio, member base, and server load have grown tremendously. Today, LinkedIn operates gl … | Continue reading
As you casually scroll through a news feed, you may “like” a post here and there. “Liking” has become so second nature that we don’t often think about what happens the moment you hit that “like” button. When we began considering building out our “likes” feature into a set of reac … | Continue reading
The Anti-Abuse AI Team at LinkedIn creates, deploys, and maintains models that detect and prevent various types of abuse, including the creation of fake accounts, member profile scraping, automated spam, and account takeovers. There are several unique challenges we face when usin … | Continue reading
The pursuit of our mission to connect the world’s professionals to make them more productive and successful is deeply dependent on the technology and infrastructure we build and maintain. Ten years ago, we had 50 million members. Fast forward five years and that number jumped to … | Continue reading