Berkeley Analytics Lab Showcase
From posters to app demos and interactive features, the Berkeley Analytics Lab Showcase is a culmination of hands-on analysis and research conducted by Berkeley Analytics students, who utilize cutting-edge analytical methods and quantitative tools to tackle real-world business and industry challenges.
This showcase will explore the transformative power of analytics across an array of industries, from sports and entertainment to fashion, finance, generative AI, healthcare, and beyond.
2024 Winners
First Place
Group 6: EdgeToHats
Team: Roxanne Wang, Fengyuan Liu, Iris Zheng, Kairan Wang, Chloe Kang, Malcom Zhao, Lingxin Li, Siyi Wu
Description: Are you a hat enthusiast or an aspiring hat designer? Embrace boundless creativity with EdgeToHats! Our tool uses machine learning models, including conditional generative adversarial networks (CGANs), to revolutionize hat design. CGANs, which generate realistic images from conditional inputs, are trained on a diverse dataset of hat designs. This enables them to learn the relationship between hand-drawn sketches and final hat designs and to produce personalized, high-quality outputs. Because the tool simplifies the design process, users can effortlessly explore a myriad of possibilities. With just a few strokes, users outline their desired hat, and EdgeToHats handles the rest, generating vibrant color schemes and intricate patterns to bring their visions to life. Say farewell to design limitations and welcome limitless creativity with EdgeToHats!
Second Place
Group 1: Crop Classification and Yield Prediction
Team: Rish Campbell, Adrian Enders, Ameesha Khan, Priyanka Ravichandran, Rahul Ranganathan, Rena Wang
Description: Businesses and farmers face the critical challenge of optimizing crop yields amidst fluctuating environmental conditions. Our innovative agritech platform addresses this by seamlessly integrating advanced machine-learning techniques with extensive agricultural data. Leveraging historical records and AI algorithms, our Crop Yield Prediction Model accurately forecasts future agricultural outputs, aiding in strategic decision-making for Farm Management Organizations and Corporate Farming Managers. Furthermore, our Image Classification Model precisely delineates farmland boundaries, optimizing land use and management practices, while our Crop Classification system utilizes Sentinel-2’s satellite imagery to discern crop types from above, facilitating tailored agricultural planning and monitoring.
Our methodology empowers users to input specific parameters such as soil pH, humidity, and geographical coordinates to customize predictions, ensuring relevance to local farming contexts. By offering insights on optimal planting dates, crop selection, and disease prevention techniques, our platform enables local farmers to maximize productivity while minimizing environmental impact.
In summary, our agritech platform not only addresses the pressing need for accurate crop yield predictions and efficient land management but also upholds ethical standards in its implementation, ensuring its relevance and reliability in diverse agricultural contexts.
Third Place
Group 10: Chest X-Ray Analysis
Team: Megha, Sabrina Yan, Qingyi Fang, Alamdeep Sethi, Manuel Loaeza
Description: In the medical field, timely and accurate diagnosis of chest-related ailments from X-ray images remains a significant challenge, often requiring extensive manual effort from radiologists.
Our solution leverages advanced deep learning techniques to automate the classification of chest X-ray images into multiple diagnostic categories. By employing convolutional neural networks (CNNs) with proven architectures, our methodology involves training these models on a comprehensive dataset of over 100,000 images. This approach not only promises to enhance diagnostic accuracy but also significantly reduces the time taken to process and interpret these images, thereby improving the overall efficiency of medical diagnostics.
Event agenda
- 12:00 PM – Welcome Remarks by Student Emcees, Kairan Wang MAnalytics ’24 and Roxanne Wang MAnalytics ’24
- 12:10 PM – Opening Remarks by Professor Daniel Pirutinsky and Career Services Director Diana Chavez
- 12:15 PM – Student Project Expo Opens
- 1:40 PM – Awards
- 1:50 PM – Closing Remarks by Berkeley Analytics Program Director Alper Atamturk
- 2:00 PM – Event Closes
Guest Judges
Konsta Jokipii
Business Analytics Manager
KONE
Apurva Arni
Senior Program Manager
Tesla
Raushan Khullar
Data Scientist
KONE
Thibaut Mastrolia
Assistant Professor
Berkeley IEOR
Chiwei Yan
Assistant Professor
Berkeley IEOR
Arman Jabbari
Staff Data Scientist
Lyft
2024 Student Projects
Team: Rish Campbell, Adrian Enders, Ameesha Khan, Priyanka Ravichandran, Rahul Ranganathan, Rena Wang
Description: Businesses and farmers face the critical challenge of optimizing crop yields amidst fluctuating environmental conditions. Our innovative agritech platform addresses this by seamlessly integrating advanced machine-learning techniques with extensive agricultural data. Leveraging historical records and AI algorithms, our Crop Yield Prediction Model accurately forecasts future agricultural outputs, aiding in strategic decision-making for Farm Management Organizations and Corporate Farming Managers. Furthermore, our Image Classification Model precisely delineates farmland boundaries, optimizing land use and management practices, while our Crop Classification system utilizes Sentinel-2’s satellite imagery to discern crop types from above, facilitating tailored agricultural planning and monitoring.
Our methodology empowers users to input specific parameters such as soil pH, humidity, and geographical coordinates to customize predictions, ensuring relevance to local farming contexts. By offering insights on optimal planting dates, crop selection, and disease prevention techniques, our platform enables local farmers to maximize productivity while minimizing environmental impact.
In summary, our agritech platform not only addresses the pressing need for accurate crop yield predictions and efficient land management but also upholds ethical standards in its implementation, ensuring its relevance and reliability in diverse agricultural contexts.
Team: Xilin Tian, Xinyu Hou, Jiayi Fang, Yunqi Liang, Yue Chu, Xinyi Li
Description: Potato crops face significant threats from diseases like early and late blight, which compromise food security and economic stability worldwide. Our solution, the interactive tool known as the Potato Leaf Health Classifier, leverages the VGG16 deep learning model for rapid and accurate disease diagnosis. Users can upload images of potato leaves, and the tool efficiently classifies their health status, enabling precise disease management.
The impact of our project is substantial; it facilitates timely interventions that dramatically reduce crop damage and enhance the sustainability of potato production. This tool is transformative for disease management practices, providing farmers with a practical, user-friendly resource to maintain crop health and productivity.
Technical skills and methods utilized include the integration of the VGG16 model, renowned for its high accuracy in image recognition. Adapted through transfer learning, the model excels in detecting specific patterns indicative of potato leaf diseases. Developed with the Flask web framework, our tool ensures ease of use and accessibility, allowing even non-technical users to benefit from real-time data processing and immediate diagnostic feedback.
Team: Boya Shao, Jialin Wang, Ketong Chen, Xuyang Wu, Zilan Luo, Yihan Liao
Description: Our project introduces a tool that changes how readers find books and how publishers understand sales on Goodreads. It uses past data to predict future book sales and matches readers with books they’ll love. The system is accessible and user-friendly: users get their results by interacting with a webpage. It has three functions. The first is a keyword search that finds the books most relevant to the input keyword. The second is a multi-objective recommendation system that finds books a specific customer is likely to be interested in. We not only minimize the distance between book vectors and user profile vectors but also protect recommendation quality by choosing books with a high historical average rating. The third function predicts future book sales. Using historical sales statistics, we apply ARIMA time series analysis to forecast sales, which helps Goodreads set future inventory levels more accurately and boost profit. The application we crafted is not just a testament to data-driven precision but also a reflection of our commitment to addressing the nuanced needs of readers and publishers alike, fostering a more connected and insightful Goodreads community.
Team: Brenda Liu, Calvin Li, Ian Dong, Jackie Wu, Queenie Tian, Tunan Li, Zekun Li, Zhiding Zhang
Description: Given the volatility of Bitcoin as a financial instrument, investors seek insight into the underlying covariates of Bitcoin to optimize their decision-making. Moreover, personal bias often hinders objective decisions during trades. We used up-to-date historical Bitcoin price data and keyword trends to predict future Bitcoin prices. Our model provides actionable insights for timing market entry and exit points more effectively, addressing the challenge of market volatility. Investors can incorporate our predictive model into their institutional trading algorithms to perform risk management, complement decision-making, and ultimately increase bottom-line profits. We experimented with nine different types of machine learning and deep learning models and selected XGBoost as our final predictive model. XGBoost is an ensemble learning method based on gradient boosting, in which each new tree is fitted to correct the errors of the trees before it. The most important covariates of Bitcoin prices are keywords such as “BIT”, “bitcoin address”, and “altcoin”. Contrary to expectation, trading volume per day has no correlation with price. We have produced a suite of interactive tools that give investors easy access to our predictive model. In our interactive dashboard, investors can view the search queries for different keywords to identify keyword trends. The OHLC charts allow investors to monitor Bitcoin prices across different time periods. The close price moving average allows investors to see the historical predictive capability of our model. We have also created an interactive widget that shows investors their predicted future profits given their current holdings. Last but not least, we created an automated trading simulator that executes transactions based on the user’s expected return preferences. Our simulator outperformed the baseline model and achieved a return rate of 112.91% over a span of 80 days (the baseline had a return rate of 15.18%).
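To make the idea of a simulator driven by expected return preferences more concrete, here is a minimal sketch. It assumes a simple all-in, all-out rule: buy when the forecast implies a return above a user-chosen threshold, and sell when it implies a loss beyond the same threshold. The rule, the `simulate` function, its parameters, and the numbers are hypothetical and are not taken from the team's simulator.

```python
def simulate(prices, forecasts, threshold=0.02, cash=1000.0):
    """Run a toy threshold strategy and return the overall return rate.

    prices[i] is the price at step i; forecasts[i] is a (hypothetical)
    model prediction of the next price, made at step i.
    """
    starting_cash = cash
    units = 0.0
    for price, forecast in zip(prices, forecasts):
        expected_return = (forecast - price) / price
        if expected_return > threshold and cash > 0:
            # Forecast gain exceeds the user's threshold: buy with all cash.
            units = cash / price
            cash = 0.0
        elif expected_return < -threshold and units > 0:
            # Forecast loss exceeds the threshold: sell everything.
            cash = units * price
            units = 0.0
    # Value any remaining holdings at the last observed price.
    final_value = cash + units * prices[-1]
    return (final_value - starting_cash) / starting_cash


# Invented prices and forecasts, for illustration only.
prices = [100.0, 110.0, 105.0, 120.0]
forecasts = [110.0, 104.0, 120.0, 118.0]
print(f"{simulate(prices, forecasts):.2%}")
```

Raising `threshold` makes the strategy trade less often, which is one way a user's return preference could shape the simulated transactions.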
Team: Jiayi Zhu, Yuechen Wang, Mu Cheng, Yuxin Chen, Zheying Shi
Description: In the rapidly evolving short-term rental market, Airbnb users often struggle to sift through vast listings to find accommodations that precisely meet their preferences and budget constraints. To address this challenge, our team developed a sophisticated recommendation system designed to enhance user satisfaction by personalizing search results more effectively.
Our solution employs a combination of machine learning algorithms to predict user preferences based on historical data and current search inputs. The core of our methodology involved the implementation of collaborative filtering techniques, which utilize both user and item-based similarities. We also incorporated content-based filtering to recommend properties similar to those a user has previously rated highly.
The system features an interactive dashboard built with Dash and Plotly, enabling users to adjust their preferences on-the-fly and see immediate updates in their recommendations. This interactive component not only improves user engagement but also allows the system to refine suggestions based on real-time feedback.
Through this project, we have significantly enhanced the Airbnb platform’s ability to match users with ideal listings, thereby increasing booking rates and user satisfaction. The integration of cutting-edge machine learning techniques and robust data handling strategies ensures that our solution is both scalable and adaptable to future enhancements.
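The collaborative filtering step described above can be sketched as follows. This example covers only the item-based variant, using cosine similarity and a similarity-weighted average; the team's system also uses user-based and content-based signals, which are not shown. The `ratings` data, listing names, and function names are invented for illustration.

```python
import math

# Toy ratings: user -> {listing: rating}. Entirely illustrative.
ratings = {
    "u1": {"loft": 5, "cabin": 4, "studio": 1},
    "u2": {"loft": 4, "cabin": 5},
    "u3": {"studio": 5, "cabin": 2},
}


def item_vector(item):
    """Ratings given to one listing, keyed by user."""
    return {user: r[item] for user, r in ratings.items() if item in r}


def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def predict(user, item):
    """Predict a rating as a similarity-weighted average of the user's
    ratings on other listings. Returns None if nothing is comparable."""
    target = item_vector(item)
    weighted_sum = 0.0
    total_similarity = 0.0
    for other, rating in ratings[user].items():
        if other == item:
            continue
        similarity = cosine(target, item_vector(other))
        weighted_sum += similarity * rating
        total_similarity += similarity
    if total_similarity == 0:
        return None
    return weighted_sum / total_similarity


print(predict("u2", "studio"))
```

Because the prediction is a weighted average of the user's own ratings, it always falls between their lowest and highest rating; production systems typically mean-center ratings first to reduce that bias.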
Team: Fengyuan Liu, Iris Zheng, Kairan Wang, Chloe Kang, Malcom Zhao, Lingxin Li, Siyi Wu
Description: Are you a hat enthusiast or an aspiring hat designer? Embrace boundless creativity with EdgeToHats! Our tool uses machine learning models, including conditional generative adversarial networks (CGANs), to revolutionize hat design. CGANs, which generate realistic images from conditional inputs, are trained on a diverse dataset of hat designs. This enables them to learn the relationship between hand-drawn sketches and final hat designs and to produce personalized, high-quality outputs. Because the tool simplifies the design process, users can effortlessly explore a myriad of possibilities. With just a few strokes, users outline their desired hat, and EdgeToHats handles the rest, generating vibrant color schemes and intricate patterns to bring their visions to life. Say farewell to design limitations and welcome limitless creativity with EdgeToHats!
Team: Qinyi (Selina) Zhang, Ruizhe (Allen) Zhou, Xiaozhe (Jack) Liu, Zehua (Henry) Qiu, Ziyang (Peter) Wei, Zijun (Sibyl) Lin
Description: In the competitive arena of digital music streaming, users often face difficulties in discovering a playlist of new songs that truly match their tastes, leading to reduced engagement and satisfaction. To tackle this challenge, we propose the integration of a sophisticated recommendation system within the Spotify app, specifically designed to enhance user experience by delivering personalized playlists based on user-provided input playlists.
Our methodology utilizes an autoencoder-based clustering approach, a content-based recommendation technique that deeply analyzes the intrinsic features of songs in a user’s playlist. The autoencoder, a type of neural network, compresses song data into a lower-dimensional space and reconstructs it to capture the essential characteristics of each track. This process allows for the grouping of songs with similar features into clusters, making it possible to identify and recommend new tracks that share these qualities, yet add variety.
This innovative system not only recommends songs that are stylistically and thematically aligned with the user’s existing preferences but also uncovers hidden gems within Spotify’s vast library, thus broadening the user’s musical horizon. By integrating this feature directly into the Spotify app, we aim to make music discovery more intuitive and deeply personalized, significantly enhancing user engagement and retention. This addresses a critical business need for Spotify by improving user satisfaction and encouraging longer, more frequent interactions with the platform.
Team: Hyunjoon Kweon, Andy Zhou, Jenny Tu, Weixiao Wang, Xiaorun Xue, Xucen Liao, Yuan Lu
Description: Navigating the dynamic financial market requires sophisticated strategies to comprehend stock price fluctuations and manage risks effectively. Our project addresses this challenge head-on by employing advanced techniques to predict stock price movements and offer tailored recommendations to stakeholders, thereby facilitating informed decision-making and risk mitigation. In this project, our team leverages publicly available market indicators and conducts sentiment analysis on textual data extracted from financial reports (10-Qs). Moreover, by deploying machine learning algorithms and time series models meticulously refined, we achieve a notable 23.2% reduction in the mean absolute error compared to the baseline model. This integrated approach ensures that investors and traders benefit from reliable forecasts, enabling them to navigate the complexities of the financial landscape with confidence and precision.
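For readers unfamiliar with the metric, the sketch below shows how mean absolute error (MAE) and a relative reduction against a baseline are computed. The price series are toy numbers chosen for easy arithmetic; they are not the team's data and do not reproduce the 23.2% figure.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline_mae, model_mae):
    """Fraction by which the model's MAE is lower than the baseline's."""
    return (baseline_mae - model_mae) / baseline_mae


# Toy values for illustration only.
actual = [100.0, 102.0, 101.0, 105.0]
baseline = [98.0, 100.0, 104.0, 101.0]  # absolute errors: 2, 2, 3, 4
model = [99.0, 101.0, 103.0, 103.0]     # absolute errors: 1, 1, 2, 2

baseline_mae = mean_absolute_error(actual, baseline)  # 2.75
model_mae = mean_absolute_error(actual, model)        # 1.5
print(f"{relative_reduction(baseline_mae, model_mae):.1%}")  # 45.5%
```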
Team: Ziyi He, Yu Tian, Yifang Liu, Roxie Zhao, Jeremy Mao
Description: In the digital age, the abundance of online streaming content can be both a blessing and a curse. With millions of movies just a click away, finding the right one can often feel like searching for a needle in a haystack. This overwhelming choice paradox not only diminishes user satisfaction but also affects platform loyalty and engagement negatively. To tackle this issue head-on, we introduce our revolutionary movie recommendation system, engineered to redefine the way viewers connect with movies they love.
Our system presents a groundbreaking solution by employing a sophisticated hybrid model that combines the strengths of multiple recommendation techniques. By integrating the personal touch of collaborative filtering, the specificity of content-based filtering, and the predictive power of deep learning algorithms, our hybrid approach ensures a highly personalized and accurate movie discovery experience. This means our recommendations are not just based on what others with similar tastes have enjoyed or the genres you prefer, but also on a deep understanding of your unique viewing habits and preferences over time.
The magic of our hybrid model lies in its ability to learn from a comprehensive dataset of user interactions, movie metadata, and contextual information, allowing it to constantly adapt and refine its suggestions to suit each user’s evolving tastes. Whether you’re a fan of undiscovered indie gems or blockbuster hits, our system narrows down the endless possibilities to those movies that are just right for you, making movie night decisions quick, easy, and satisfying.
Embrace the future of personalized entertainment with our movie recommendation system, where discovering your next movie obsession is effortlessly intuitive, uniquely yours, and just a click away.
Team: Megha, Sabrina Yan, Qingyi Fang, Alamdeep Sethi, Manuel Loaeza
Description: In the medical field, timely and accurate diagnosis of chest-related ailments from X-ray images remains a significant challenge, often requiring extensive manual effort from radiologists.
Our solution leverages advanced deep learning techniques to automate the classification of chest X-ray images into multiple diagnostic categories. By employing convolutional neural networks (CNNs) with proven architectures, our methodology involves training these models on a comprehensive dataset of over 100,000 images. This approach not only promises to enhance diagnostic accuracy but also significantly reduces the time taken to process and interpret these images, thereby improving the overall efficiency of medical diagnostics.
Team: Zhaoting Qiu, Zihao Yang, Wenzhe Gao, Cassidy Dong, Xiaoning Sun, Kaihui Xie, Qianyi Zhang
Description: Different audiences need to understand the game of soccer, and our project serves them with a web-based interactive platform for outcome prediction, focusing on shooting and passing. The reviews we generate help soccer teams improve performance based on past matches and real-time inputs, help soccer enthusiasts better understand the game, and help analysts and journalists gain meaningful insights. We use machine learning and deep learning algorithms (decision tree, XGBoost, a multilayer perceptron with two hidden layers, random forest, and a CNN) to predict goal probability and passing success. Users can adjust various factors in the interactive toolbar to simulate different scenarios and get the results they need. They can also upload game event data to check game performance and statistics.
Team: Yuejia Li, Zhiyi Wang, Junyao Lu, Binglan Lin, Zhihao Du, Tianyi Zhang, Zhe Wang, Zongxin Chen
Description: In response to the challenges of information overload and the difficulty in discovering personalized and quality business recommendations, our project developed an advanced Yelp Recommendation System. Our solution harnesses the vast dataset provided by Yelp, encompassing millions of user reviews, business attributes, and user interactions, to create a sophisticated recommendation engine. We employed a multi-faceted methodology that integrates Singular Value Decomposition (SVD), k-nearest neighbors (KNN), Random Forest, and Neural Collaborative Filtering (NCF) to predict user preferences with high accuracy. Our system stands out by addressing both individual and business user needs, offering tailored recommendations that enhance user satisfaction, streamline search efficiency, and ensure relevance. For individual users, it means personalized dining experiences and effortless discovery of local eateries and trending spots, while businesses benefit from increased visibility and insightful consumer behavior analytics. This dual approach not only promotes a richer user experience but also fosters a vibrant, competitive marketplace for businesses of all sizes. Our project’s impact is further amplified through an interactive tool that enables real-time, user-driven recommendations, offering a dynamic and engaging platform for exploring local businesses. This recommendation system represents a significant advancement in data-driven, personalized user experiences in the digital marketplace.
Team: Yanze Li, Lei Zhang, Xinze Chen, Junzhou Ching, Jiahui Wang
Description: Our team has created an interactive dashboard that provides real-time forecasts of QQQ ETF trading volume, offering both aggregate (15 and 30-minute sums) and minute-by-minute forecasts for the upcoming 15 to 30 minutes. Users can view data in aggregate or by individual minutes, with key indicators such as volatility and forecasted volume changes. The dashboard updates every 15 minutes, automatically fetching data from the Financial Modeling Prep API, retraining the bidirectional LSTM model with 390 minutes of lag, and refreshing the display. This tool is tailored for intraday stock traders, providing them with insights to trade at times of higher market liquidity for better outcomes.
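The data preparation implied by this description can be sketched without the model itself. The example below builds training pairs from a minute-level volume series using 390 lagged minutes (the length of one regular US trading session) and sums a minute-by-minute forecast into 15- and 30-minute totals. The function names, the 30-minute horizon per sample, and the synthetic series are assumptions; the LSTM and the API calls are omitted.

```python
def make_windows(series, lag, horizon):
    """Build (inputs, targets) pairs: `lag` past values -> next `horizon` values."""
    pairs = []
    for start in range(len(series) - lag - horizon + 1):
        inputs = series[start:start + lag]
        targets = series[start + lag:start + lag + horizon]
        pairs.append((inputs, targets))
    return pairs


def aggregate(minute_forecasts):
    """Sum minute-level forecasts into 15- and 30-minute totals."""
    return {
        "next_15": sum(minute_forecasts[:15]),
        "next_30": sum(minute_forecasts[:30]),
    }


# Synthetic minute volumes 1..430, purely for illustration.
volumes = list(range(1, 431))
pairs = make_windows(volumes, lag=390, horizon=30)

first_inputs, first_targets = pairs[0]
print(len(pairs), len(first_inputs), len(first_targets))  # 11 390 30

# Here the true targets stand in for a model's minute-by-minute forecast.
print(aggregate(first_targets))  # {'next_15': 5970, 'next_30': 12165}
```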
If you require an accommodation for effective communication (ASL interpreting/CART captioning, alternative media formats, etc.), please reach out to Jenny Huang at jrhuang@berkeley.edu by April 9, 2024.