Emerging Trends in Software Engineering

University of Oulu




Length of the document: the suggested length is 10-30 pages. The number of figures greatly affects the report length.




Start to think about your own topic


Report structure

Here are the headings and some suggestions on the content of the group work report.

1. Introduction

- What is this topic about? (The reader might not know what, e.g., sentiment analysis or cryptocurrencies are.)
- Why is this topic important (globally)?
- Why is this topic important (to your group)?

2. Search Strings

- What search strings were used?
- Are the search strings the same for all data sources?
- What was the process for forming a suitable search string?
- After running all the analyses, are you able to find an even better search string?

3. Stopwords

- Report a list of the stop words you used. Please exclude common English stop words and report only your context-specific stop words; for example, “software” could be a stop word in some topics but not in others.
- Explain the process of coming up with suitable stop words.
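
As a rough illustration of what context-specific stop words do, the sketch below counts terms before and after removing a small stop word set. It is a minimal Python sketch, not the course's R script; the stop words, the example documents, and the helper names `tokenize` and `term_counts` are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical context-specific stop words for a software-related topic.
CONTEXT_STOPWORDS = {"software", "paper", "study"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def term_counts(documents, stopwords):
    """Count terms across all documents, skipping the given stop words."""
    counts = Counter()
    for doc in documents:
        counts.update(t for t in tokenize(doc) if t not in stopwords)
    return counts

# Made-up example documents, not real course data.
docs = [
    "This paper presents a study of software testing",
    "Software testing and software quality",
]

before = term_counts(docs, set())
after = term_counts(docs, CONTEXT_STOPWORDS)
print(before.most_common(3))
print(after.most_common(3))
```

Comparing the two lists is one way to iterate on the stop word list: terms that dominate the counts without telling anything about the topic are candidates for removal.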

4. Dendrogram and hierarchical clustering

- Show your dendrogram figure (or a part of it if it is too big)
- Explain what is in the figure
- Did you find this technique useful?
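
To help with explaining the figure, the sketch below shows the idea behind a dendrogram: the two closest clusters are merged repeatedly, and each merge happens at a certain distance (the height of the join in the figure). This is a minimal pure-Python sketch that assumes Euclidean distance and single linkage; the course scripts may use a different distance or linkage, and the terms and vectors are made up.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(c1, c2, vectors):
    """Cluster distance = smallest distance between any two members."""
    return min(euclidean(vectors[i], vectors[j]) for i in c1 for j in c2)

def agglomerate(labels, vectors):
    """Merge the two closest clusters repeatedly; return the merge history."""
    clusters = [(i,) for i in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = single_linkage(clusters[a], clusters[b], vectors)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        left, right = clusters[a], clusters[b]
        merges.append(([labels[i] for i in left], [labels[i] for i in right], d))
        # Delete the higher index first so the lower index stays valid.
        del clusters[b]
        del clusters[a]
        clusters.append(left + right)
    return merges

# Made-up term vectors: each row is how often a term occurs in three documents.
terms = ["agile", "scrum", "blockchain", "bitcoin"]
vectors = [
    [5, 4, 0],
    [4, 5, 0],
    [0, 1, 6],
    [0, 0, 8],
]

for left, right, dist in agglomerate(terms, vectors):
    print(f"{left} + {right} merged at height {dist:.2f}")
```

Terms that merge at a low height occur in similar documents; the last merge, at the greatest height, joins the two most dissimilar groups.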

5. Word clouds

- Show word clouds and comparison word clouds from all 3 data sources (Scopus, Twitter, StackOverflow)
- Report the differences between sources, between old and new sources, and between popular (top cited/voted/retweeted) sources
- Did you find this technique useful?
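
One way to reason about a comparison word cloud is that each term is assigned to the source where it is most over-represented. The sketch below computes, for each term, the source with the highest relative frequency and how far that frequency lies above the average across sources. It is a minimal Python sketch under that assumption; the course scripts may weight terms differently, and the token lists are made up.

```python
from collections import Counter

def rates(tokens):
    """Relative frequency of each term within one source."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}

def comparison_weights(sources):
    """Map each term to (source where its rate is highest,
    that rate minus the average rate across all sources)."""
    per_source = {name: rates(tokens) for name, tokens in sources.items()}
    vocabulary = set().union(*(r.keys() for r in per_source.values()))
    result = {}
    for term in vocabulary:
        mean = sum(r.get(term, 0.0) for r in per_source.values()) / len(per_source)
        name, rate = max(
            ((n, r.get(term, 0.0)) for n, r in per_source.items()),
            key=lambda pair: pair[1],
        )
        result[term] = (name, rate - mean)
    return result

# Made-up token lists for the three data sources.
sources = {
    "Scopus": ["model", "model", "evaluation", "error"],
    "Twitter": ["hype", "model", "news", "hype"],
    "StackOverflow": ["error", "error", "model", "exception"],
}

weights = comparison_weights(sources)
for term, (source, weight) in sorted(weights.items(), key=lambda kv: -kv[1][1]):
    print(f"{term}: {source} ({weight:.3f})")
```

Terms with a large weight are distinctive for one source, while terms used equally everywhere get a weight near zero and would appear small in the cloud.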

6. Top-5 Popular sources

- List and explain the top-5 sources for each of the 3 data sources. What topics are the top-5 sources about? What are the commonalities and differences between them?
- Did you find this technique useful?
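
Selecting the top-5 sources amounts to sorting by a popularity measure and keeping the first five. The sketch below is a minimal Python illustration; the records, the `citations` field, and the helper `top_n` are hypothetical, and for Twitter or StackOverflow the key would be retweets or votes instead.

```python
def top_n(records, key, n=5):
    """Return the n records with the highest value for the given key."""
    return sorted(records, key=lambda r: r[key], reverse=True)[:n]

# Made-up records, not real Scopus data.
papers = [
    {"title": "Paper A", "citations": 12},
    {"title": "Paper B", "citations": 3},
    {"title": "Paper C", "citations": 87},
    {"title": "Paper D", "citations": 40},
    {"title": "Paper E", "citations": 0},
    {"title": "Paper F", "citations": 25},
    {"title": "Paper G", "citations": 9},
]

for paper in top_n(papers, "citations"):
    print(paper["citations"], paper["title"])
```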

7. Interactive LDA clustering

- Study the interactive LDA clusters. Take some screenshots and report 3-5 findings that you find interesting
- Did you find this technique useful?

8. Hot and cold topics (with LDA clustering)

- Show the hot and cold topics by graphs and by listing top-10 terms
- Did you find this technique useful?
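
A simple way to think about hot and cold topics is as topics whose share of the documents rises or falls over time. The sketch below fits a least-squares line to each topic's yearly share and ranks topics by slope. It is a minimal Python sketch under that assumption; the course scripts may define hot and cold differently, and the topic names and shares are made up.

```python
def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

def rank_topics(years, topic_shares):
    """Sort topics from the steepest rise (hot) to the steepest fall (cold)."""
    trends = {t: slope(years, shares) for t, shares in topic_shares.items()}
    return sorted(trends.items(), key=lambda item: item[1], reverse=True)

# Made-up yearly topic shares (fraction of documents assigned to each topic).
years = [2013, 2014, 2015, 2016, 2017]
topic_shares = {
    "microservices": [0.05, 0.08, 0.12, 0.18, 0.25],
    "waterfall": [0.30, 0.24, 0.20, 0.15, 0.10],
    "testing": [0.20, 0.21, 0.19, 0.20, 0.20],
}

for topic, trend in rank_topics(years, topic_shares):
    label = "hot" if trend > 0 else "cold"
    print(f"{topic}: {trend:+.3f} per year ({label})")
```

A slope close to zero, as for the made-up "testing" topic here, indicates a stable topic, so in practice a threshold around zero may be more informative than the sign alone.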

9. Discussion and conclusions

Summarize and discuss your findings. What are the most notable findings? What trend mining techniques did you find the most beneficial? Are there some techniques that could be added? Are there interesting directions for future research? Is there a call for action (now what)?

Exercise material

Scripts for the Trend Mining Exercises
DTM Script
General Introduction to R from Tilastollisen data-analyysin perusteet tietojenkäsittelytieteilijöille, 5 op
Example material (Note: this is outdated. It only includes Scopus data, and author analysis is no longer recommended.)


Updated 28 Sep 17 at 12:18
