Machine learning to help researchers evaluate biases in clinical trials: a prospective, randomized user study

Table 6 Illustrative examples of participant qualitative feedback, stratified by SUS score category

Qualitative Feedback	SUS score from user
SUS scores ≥ 68 (i.e. rated usability as above average)
We should have 3 risk choices (as it is for Cochrane review): low, unclear and high risk, and not only low versus unclear/high.	97.5
I think the suggested text is good, so long as it doesn’t make people lazy.	92.5
The two blinding questions are confusing because the form did not specify which outcome that I am supposed to assess.	90
I had a hard time finding the green highlighted information… this colour was hard for me to see	87.5
I think it would be helpful to have a pop-up window that would have the explanation for each of the risk of bias questions.	77.5
SUS scores < 68 (i.e. rated usability as below average)
Technical problems with highlighting occurred.	67.5
Risk that reviewers would only focus on suggested text, not full text.	65
What was the order of the annotated text when multiple pieces of text were highlighted? Is it sorted by text order or relevance? I hope it’s by relevance!	57.5
The text suggestions usually had at least one relevant sentence (maybe 1 out of 3) for rating so it meant I was unclicking things to be accurate.	57.5
I found myself searching for an “undo” button when I deleted a suggested text spot, then changed my mind.	55

ISSN: 1472-6947