My Summer Research Internship at SciTech Lab


About the author:
Kelsie Lam is a senior at Mission San Jose High School. She is a SciTech research intern interested in artificial intelligence and cognitive science. In the lab, Kelsie investigates how scientists use machine learning methods to support their research and works on developing workflows. GitHub: https://github.com/kelsielam


During my internship at the SciTech lab, I learned about workflow management systems, specifically Pegasus (an open-source scientific workflow management system): what it does, how it works, and why it is beneficial for scientists to use in their research. I was fortunate to have Patrycja Krawczuk, a second-year PhD student in the lab with extensive workflow and machine learning knowledge, as my mentor. She taught me how to create and execute a workflow, identify the errors that occur, and resolve them. (Check out Fig. 1 below, taken during one of our Zoom meetings ;))

Fig. 1: Zoom meeting with Patrycja Krawczuk
Fig. 2: This Python script was written to easily import and open all the images in the Images folder so they could be resized.

Going into this internship, I had only a little experience with coding and didn’t know much about machine learning. However, after learning and practicing with different workflows, I was able to successfully write the code for a workflow that resizes any images placed in a certain folder (Fig. 2).
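To give a rough idea of what an image-resizing step like this can look like, here is a minimal sketch. It is not the exact script from the workflow: the function names, the output folder, the list of file extensions, and the 256 x 256 target size are all assumptions made for illustration, and the resizing itself relies on the third-party Pillow library.

```python
from pathlib import Path

# Assumed set of extensions to treat as images (illustrative only).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_images(folder):
    """Return a sorted list of the image files directly inside `folder`."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def resize_images(folder, out_folder, size=(256, 256)):
    """Resize every image in `folder` and save the copies in `out_folder`."""
    # Pillow is a third-party package (pip install Pillow). It is imported
    # here so that find_images can still be used without it.
    from PIL import Image

    out_dir = Path(out_folder)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for path in find_images(folder):
        with Image.open(path) as img:
            # resize() returns a new image; the original file is left untouched.
            resized = img.resize(size)
            target = out_dir / path.name
            resized.save(target)
        written.append(target)
    return written
```

Calling `resize_images("Images", "Resized")` would then write a resized copy of every image in a folder named Images into a folder named Resized (both folder names are placeholders). Note that a fixed target size does not preserve the aspect ratio of the original image.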

Additionally, I worked on a text analysis workflow that counts the number of times each word shows up in a Harry Potter book (Figs. 3 and 4). With my newly acquired Python skills, I edited the code so that lowercase and uppercase letters are counted the same and all special characters are removed (Fig. 5). These edits to the Python code allowed the workflow to produce an accurate output. Though I ran into various errors while working on these workflows, I learned from each mistake and now have a better understanding of why certain lines of code were not working and how to be more efficient in the future.
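The cleaning and counting steps described above can be sketched in a few lines of Python. This is a simplified illustration rather than the actual workflow script: the function name `count_words` and the `top_n` parameter are made up for this example, and "special characters" is assumed here to mean anything other than letters, digits, and whitespace.

```python
import re
from collections import Counter


def count_words(text, top_n=25):
    """Return the `top_n` most common words in `text` with their counts."""
    # Lowercase everything so "Harry" and "harry" count as the same word.
    lowered = text.lower()
    # Remove every character that is not a letter, digit, or whitespace.
    # Side effect of this assumption: "didn't" becomes "didnt".
    cleaned = re.sub(r"[^a-z0-9\s]", "", lowered)
    # Split on whitespace and tally the words.
    words = cleaned.split()
    return Counter(words).most_common(top_n)
```

For example, `count_words(open("book.txt").read())` would return the 25 most used words in a text file (the file name is a placeholder), which is the kind of data plotted in a graph like Fig. 4.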

Fig. 3: This diagram gives a simple representation of the workflow and the different steps and scripts it took to produce the output.

Fig. 4: This graph was produced as the final output in the text analysis workflow and shows the top 25 most used words in the book.

In preparation for building my workflow, I referred to many research papers and analyzed each step of the machine learning workflows they described. One paper that I studied focused on how deep learning is used for climate models. Through this paper, I was able to understand the problem the scientists are addressing, how they solve it, and how Pegasus and Panorama 360 could be useful for their research.

When I first started my internship, I felt nervous and inexperienced. However, through the connections I’ve made and the skills I’ve developed, I now feel a lot more comfortable and capable. My internship has given me many opportunities to practice my skills and learn from my experiences. Moving forward, I will continue to work on more workflows, such as ones that can do facial recognition, and I plan to create a video to teach others how to build and run a successful workflow. I’m excited to keep learning about coding and machine learning and to see where my knowledge and skills will take me.

Fig. 5: This Python script was written to eliminate all special characters and allow lowercase and uppercase letters to be counted the same.
