Gain research experience in data science applied to wicked software engineering problems.
The Science of Software REU Site at NC State University immerses a diverse group of undergraduates in a vibrant research community working on data science and software engineering. Research projects include visualization and data manipulation in virtual reality, model-based reasoning, human aspects (e.g., eye tracking data), bug fixing, and software analytics, which is a cross between software engineering and data mining methods.
Students will work alongside research faculty to gain hands-on experience with data science skills: machine learning, data engineering, statistics, and studying developer behavior through observation, interviews, surveys, biometrics, and data collection. REU students present their work at an undergraduate research competition, develop videos to convey their research to K-12 students, and are encouraged to submit their technical papers for peer-reviewed publication.
Understanding Regular Expression Understandability. Most programmers use regular expressions when they code, and those same programmers often complain about them, too! Have you ever wondered why regular expressions are so hard to read and write? The goals of this project are to 1) explore the learning barriers programmers encounter when reading, writing, and fixing regular expressions, and 2) develop techniques to automatically repair broken regular expressions. You will learn the basics of source code mining and source code analysis, and explore how programmers learn.
Crowdsourcing Quality. Crowdsourcing has become a common approach for companies, researchers, and individuals to acquire the opinions of hundreds, if not thousands, of people for very little money. However, current platforms fall short in their ability to steer the crowd toward producing quality results. In this work, we will explore and evaluate techniques for crowd control when obtaining opinions on software models.
Reading is hard. Software engineers and researchers spend thousands of hours a year poring over documentation and research papers. We seek intelligent librarians to assist humans in navigating all that textual data.
In this project we explore and refactor state-of-the-art text mining methods from both evidence-based medicine and legal electronic discovery to better support software engineers and software engineering researchers. Our current state of the art is that we can "read" 10,000 papers by skimming just a few dozen, then asking text miners to go find other potentially relevant papers. But those methods are very preliminary, and much further work is required before we can ask a wider audience to use our tools. We have a software laboratory called MAR (machine-assisted reading) in which we can perform extensive experiments on better ways to help people explore large sets of documents. MAR is not really "one" tool; rather, it is a place where we can experiment with many tools. Think of it as a platform on top of which you can rapidly explore text mining methods.
So, do you want to learn a lot about text mining and scientific experimentation? Then sign up for this project. For more on this work, see
Developer Adaptive Modeling. Most software engineering tools assume that all software developers are the same. Bug trackers, code review tools, and static analysis tools look the same to every developer, even though some developers know more and some developers know less about the concepts those tools convey. In this project, we'll explore how those tools can adapt automatically to the developer looking at them. (link: http://people.engr.ncsu.edu/ermurph3/papers/fse15-nier-brittany.pdf)
Gender Bias in Software Development. The creation of software is a noble and rewarding pursuit, but like all human endeavors, those who participate are not immune to harmful biases. One of these biases is gender bias. Building on our prior work that looked at gender bias during pull requests on GitHub, in this project we'll explore other aspects of gender bias, including how people talk to one another and participant resiliency. (link: peerj.com/preprints/1733/)
Error messages. Error messages produced by software engineering tools, from refactoring tools to profiling tools, are notoriously obscure. Although these messages have improved significantly in the last 30 years, novices continue to find them difficult to understand, and experts find them obtuse. In this project we'll seek to better understand the challenges developers face with these messages, as well as design new messages that address those challenges. (link: http://people.engr.ncsu.edu/ermurph3/papers/ICSE14_NIER.pdf)
The projects you can work on are not limited to the ones above. You may find opportunities to contribute to many other projects during your participation in the program.
Please instruct your reference writers to upload their letters to this site. They will need a Google account to upload a letter; if they do not have one, they may email their letter to emerson@csc.ncsu.edu
Thinking about joining us? That's great! Give us a call or send us an email and we will answer any questions you have.
503.545.5312