Note: This article also appears on my medium.com account.
This short article looks at how to evaluate summaries produced by text summarization. The toolkit used here is Fuzzy Rough Sets: the reference summary and the system summary are compared using Fuzzy Rough Set based lower similarity and upper similarity. This approach has not been evaluated yet. Evaluating it requires comparing its results with the usual ROUGE recall scores for n-grams. The intuition is that computing the lower and upper approximations draws on more information than an n-gram model does.
The definition of the Fuzzy Rough Set based lower and upper approximations is given as follows:
Definition. A generalized definition of the lower and upper approximations of a Fuzzy Rough Set, where R is a fuzzy equivalence relation, is as follows:
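The formula itself is not reproduced in this text. The form commonly given in the fuzzy rough set literature is sketched below, on the assumption that it is the one intended. The universe X, the fuzzy set A, the implicator I and the t-norm T are symbols introduced here for illustration and are not taken from the article.

```latex
% Lower approximation of a fuzzy set A under the fuzzy relation R
(R \downarrow A)(y) = \inf_{x \in X} \mathcal{I}\bigl(R(x, y),\, A(x)\bigr)

% Upper approximation of a fuzzy set A under the fuzzy relation R
(R \uparrow A)(y) = \sup_{x \in X} \mathcal{T}\bigl(R(x, y),\, A(x)\bigr)
```

With a reflexive R and the usual boundary conditions on I and T, the lower approximation never exceeds A(y) and the upper approximation never falls below it.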
Let the system-produced summary be E1 and let the reference summary be R1. Two kinds of similarity are computed:
1. Lower Similarity
2. Upper Similarity
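The article does not spell out how the two similarities are computed, so the following is only a minimal sketch under stated assumptions. It assumes a word-level universe, a crisp membership of each word in the reference summary, a character-bigram Dice score as the fuzzy relation (a reflexive, symmetric tolerance relation rather than a true fuzzy equivalence relation), the Kleene-Dienes implicator, and the minimum t-norm. All function names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def char_bigrams(word):
    """Set of character bigrams of a word, padded with '#' at both ends."""
    padded = f"#{word}#"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def relation(w1, w2):
    """Assumed fuzzy relation R: Dice similarity on character bigrams."""
    a, b = char_bigrams(w1), char_bigrams(w2)
    return 2 * len(a & b) / (len(a) + len(b))


def fuzzy_rough_similarity(system_summary, reference_summary):
    """Return (lower similarity, upper similarity) of E1 with respect to R1."""
    sys_tokens = tokenize(system_summary)
    ref_vocab = set(tokenize(reference_summary))
    if not sys_tokens:
        return 0.0, 0.0

    # Universe: every distinct word seen in either summary.
    universe = ref_vocab | set(sys_tokens)
    # Assumed fuzzy set A: crisp membership in the reference vocabulary.
    membership = {w: 1.0 if w in ref_vocab else 0.0 for w in universe}

    lower_total, upper_total = 0.0, 0.0
    for y in sys_tokens:
        # Lower approximation: inf over x of max(1 - R(x, y), A(x)).
        lower_total += min(
            max(1.0 - relation(x, y), membership[x]) for x in universe
        )
        # Upper approximation: sup over x of min(R(x, y), A(x)).
        upper_total += max(
            min(relation(x, y), membership[x]) for x in universe
        )

    n = len(sys_tokens)
    return lower_total / n, upper_total / n


if __name__ == "__main__":
    lower, upper = fuzzy_rough_similarity("cats sat", "the cat sat")
    print(f"lower={lower:.3f} upper={upper:.3f}")
```

Under these assumptions the lower similarity is the strict score (a system word absent from the reference contributes 0), while the upper similarity gives partial credit to near matches such as "cats" and "cat", which an exact n-gram match would miss.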
These two similarities measure how close the system-generated summary is to the gold reference summary. Compute the similarity between the system summary and the reference summary, then compute the ROUGE scores, and examine how well the two sets of scores correlate.
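The comparison step can be sketched as follows. This is a simplified, hand-written ROUGE-N recall (clipped n-gram overlap divided by the number of reference n-grams) together with a plain Pearson correlation. It is not the official ROUGE toolkit, the tokenizer is an assumption, and the function names are hypothetical. The fuzzy rough scores are expected as a list computed separately, one per summary pair.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngram_counts(tokens, n):
    """Multiset of word n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(system_summary, reference_summary, n=1):
    """Clipped n-gram overlap divided by the number of reference n-grams."""
    sys_counts = ngram_counts(tokenize(system_summary), n)
    ref_counts = ngram_counts(tokenize(reference_summary), n)
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(c, sys_counts[g]) for g, c in ref_counts.items())
    return overlap / total


def pearson(xs, ys):
    """Pearson correlation of two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return float("nan")  # correlation is undefined for constant scores
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    print(rouge_n_recall("the cat sat", reference, n=1))
    print(rouge_n_recall("the cat sat", reference, n=2))
```

To run the exercise, collect one ROUGE recall score and one lower (or upper) similarity score per summary pair, then pass the two lists to `pearson`. A meaningful correlation needs more than a handful of pairs.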
This was the guideline for your Research Exercise, which can be taken as an AI Exercise as well.