Code formatting in the forum

Jan
Jan on 4 Apr 2013
Answered: Jan on 4 May 2015
Although this forum is now in its 3rd year online and thousands of examples can be found, it is still tedious to ask beginners to format their code. The experienced contributors have explained the procedure thousands of times, and fewer than a handful of the beginners have found the time to thank them for it.
The problem has been mentioned exhaustively in the wish-list already. It shouldn't be complicated to solve by showing explicit instructions the first 5 times a user posts a question. Obviously neither the "{} code" nor the "? Help" button encourages people to learn the basics of the forum. But I'd hope that they would spend the time to read text instructions like:
Formatted code is a core feature of this forum. Insert a blank line before and after the code and start each line with at least 2 spaces.
Follow the "? help" button to learn more.
And when this message disappears after the 5th posting, it could even get a red background and some flashing effects.
This would be much more efficient than leaving this thankless job to the editors and other diligent users.
  5 Comments
Evan
Evan on 6 Aug 2013
I agree that more voting would make better use of the "reputation" system. It seems like the communities on other help forums vote much less sparingly, while here 0 or 1 is the most common score for even excellent answers.
Oftentimes, I notice that a user has submitted a very detailed and on-point response to someone's question and think to myself it's a shame that the answer was never accepted. It's only lately that I'm realizing that, even though I'm not the OP, I actually have the ability to at least give the author some sort of feedback/credit for their effort.
Jan
Jan on 15 Aug 2013
@Evan: Once you have more than 500 reputation points, you can accept an answer to another contributor's question if the author of the question has not selected one for 7 days. Currently 48 users have reached this level, but frequent voting will increase that number.


Accepted Answer

Cedric
Cedric on 6 Aug 2013
Edited: Cedric on 8 Aug 2013
EDIT @ 4:30pm EST: strfind -> regexp with neg. look-behind to avoid matching nbsp;.
Here is a simple crawler. It is not my original idea, which was a mechanism at Mathworks level and not at a user (one of us) level. I implemented a few criteria which are not those listed above, as the crawler has to work with content that was already parsed and "preformatted" by the forum.
The implemented criteria should be improved. Typically, the function call(s)/def(s) detection is too "simple" and generates false positives when users write function names followed by parentheses in normal text.
Anyhow, this is just a simple demo.
The whole code below (both functions) should be saved in forumCrawler.m, and you can set pageDepth to control how many forum pages you want to process.
----------------------------------------------------------------------------------------------------------------
  function forumCrawler
    pageDepth = 1 ;
    baseURL = 'http://www.mathworks.com' ;
    for pageId = 1 : pageDepth
      fprintf('\n=== Processing page %d..\n', pageId) ;
      url = sprintf('%s/matlabcentral/answers/?page=%d', baseURL, pageId) ;
      % - Collect thread links (assumed to appear as <h3><a href="...">).
      thread = regexp(urlread(url), '(?<=<h3><a href=").*?(?=")', 'match') ;
      nThread = length(thread) ;
      for tId = 1 : nThread
        fprintf(' - Analyzing thread %d/%d..\n', tId, nThread) ;
        url = sprintf('%s%s', baseURL, thread{tId}) ;
        htmlBuffer = urlread(url) ;
        % - Scan question (skipped if no question body is found).
        question = regexp(htmlBuffer, ...
          '(?<=class="question-body ).*?(?=</div>)', 'match') ;
        if ~isempty(question)
          [tf, msg] = isLikelyUnformatted(question{1}) ;
          if tf
            fprintf(' [<a href="%s">question</a>] %s.\n', url, msg) ;
          end
        end
        % - Scan answers. Tokens: {1} = element id, {2} = body.
        answer = regexp(htmlBuffer, ...
          '<div id="([^"]+)" class="answer-body">(.*?)</div>', 'tokens') ;
        for cId = 1 : length(answer)
          [tf, msg] = isLikelyUnformatted(answer{cId}{2}) ;
          if tf
            answerUrl = sprintf('%s#%s', url, answer{cId}{1}) ;
            fprintf(' [<a href="%s">answer</a> ] %s.\n', ...
              answerUrl, msg) ;
          end
        end
        % - Scan comments. Tokens: {1} = element id, {2} = body.
        comment = regexp(htmlBuffer, ...
          '<div id="([^"]+)" class="comment-body">(.*?)</div>', 'tokens') ;
        for cId = 1 : length(comment)
          [tf, msg] = isLikelyUnformatted(comment{cId}{2}) ;
          if tf
            commentUrl = sprintf('%s#%s', url, comment{cId}{1}) ;
            fprintf(' [<a href="%s">comment</a> ] %s.\n', ...
              commentUrl, msg) ;
          end
        end
      end
    end
  end

  function [tf, msg] = isLikelyUnformatted(content)
    tf = true ;
    % Eliminate content within <pre>.. and |..| tags,
    % so we work on what is meant to be text.
    buffer = regexp(content, '<pre.*?</pre>', 'split') ;
    content = [buffer{:}] ;
    buffer = regexp(content, '<tt.*?</tt>', 'split') ;
    content = [buffer{:}] ;
    % Check for a few indicators.
    % - Range definition, e.g. 1:10.
    if ~isempty(regexp(content, '\w:\w', 'once'))
      msg = 'range def. found' ; return ; end
    % - Word character followed by a literal opening parenthesis.
    if ~isempty(regexp(content, '\w\(', 'once'))
      msg = 'function call(s)/def(s) found' ; return ; end
    % - Paragraph ending with ";" that is not part of "&nbsp;".
    if ~isempty(regexp(content, '(?<!nbsp);</p>', 'once'))
      msg = '";</p>" found' ; return ; end
    tf = false ;
    msg = '' ;
  end
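The three indicators can be tried out without crawling the forum. The following is only a sketch: the sample fragments are invented, and the `\w(` pattern is written here as `\w\(`, on the assumption that a literal parenthesis was intended. It also reproduces the false positive mentioned above, where a function name followed by parentheses in normal text is flagged.

```matlab
% Indicator patterns as used in isLikelyUnformatted (parenthesis escaped).
patterns = {'\w:\w', '\w\(', '(?<!nbsp);</p>'} ;
labels   = {'range def.', 'function call/def', '";</p>"'} ;

% Hypothetical post fragments, invented for illustration only.
samples = {'<p>for k=1:10</p>', ...
           '<p>Use sum() to add the values.</p>', ...
           '<p>x = 3;</p>', ...
           '<p>Thanks&nbsp;</p>', ...
           '<p>It works now.</p>'} ;

found = cell(size(samples)) ;
for sId = 1 : numel(samples)
  % Logical vector: which indicators match this fragment?
  hit = cellfun(@(p) ~isempty(regexp(samples{sId}, p, 'once')), patterns) ;
  firstHit = find(hit, 1) ;
  if isempty(firstHit)
    found{sId} = '' ;                 % no indicator matched
  else
    found{sId} = labels{firstHit} ;   % report the first indicator only
  end
  fprintf('%-40s -> %s\n', samples{sId}, found{sId}) ;
end
```

The second fragment is plain text, yet it is reported as a function call, while the `&nbsp;` fragment is left alone thanks to the negative look-behind.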
  4 Comments
Evan
Evan on 8 Aug 2013
Edited: Evan on 8 Aug 2013
This is a really slick little function. And if something similar were implemented on TMW's end, even false positives would be pretty harmless. I think we'd still end up with people neglecting formatting (after all, nowadays popup dialogs and warning messages are either 1) meant to be ignored or 2) an exercise for honing your ability to quickly close windows). Still, it's a simple enough feature that it's worth having.
Cedric
Cedric on 8 Aug 2013
I thought about it a little more and, somehow, I wouldn't mind automatically having an intermediary page when we submit a question (not for comments or answers, but for questions only) with a big red message reminding about formatting and displaying the post as a preview. We don't post that many questions after all, so it wouldn't be annoying.
I think that this mechanism is light enough so it wouldn't take Mathworks that much time/work to implement.


More Answers (5)

Jan
Jan on 6 Aug 2013
Bump.
It is really tedious to remind so many newcomers in the forum to format their questions. But ignoring the questions due to the lack of readability would reduce the quality of the forum.
Does nobody have an idea how newcomers could be motivated to apply proper formatting?
  3 Comments
Jan
Jan on 6 Aug 2013
Edited: Jan on 6 Aug 2013
@Cedric: I have the impression that you are able to write a function that recognizes missing formatting. Matlab's urlread can grab the contents from the forum automatically. But I hesitate to auto-format other people's messages, remote-controlled from my local Matlab. Adding a comment with the usual suggestions, however, would not be dangerous.
I'd still prefer that TMW implements this to let us concentrate on the quality of the answers. Thunderbird (p)recognizes when I want to attach a file. Amazon offers opinions about what I want to buy. And if TMW guesses that all questions contain code, the false classification rate will be below 50%. So if a beginner (less than 4 questions) starts to type in any character, a popup appears saying "If you want to insert code...". I hate popups, but this could be a valid application. Hm, I even let my browser suppress popups. Perhaps the idea would not work in reality.
Cedric
Cedric on 6 Aug 2013
Edited: Cedric on 6 Aug 2013
@Jan: I meant at Mathworks level, in PHP or whatever language they are using, they could implement a detection based on this list of criteria and display a warning if needed. These criteria would certainly catch most cases where there is unformatted code (and we don't need 100% accuracy), and their implementation is a matter of building a few regular expressions.
Also, this mechanism wouldn't prevent a user from submitting an answer/comment, but would just add a warning page displaying a big red message warning that some unformatted code seems to have been detected and asking the user to either go back, or confirm that he/she wants to post the current content.
That said, if there is any interest, I am probably able to build a MATLAB-based crawler which detects threads with unformatted code based on the aforementioned list of criteria, yes.



Iain
Iain on 6 Aug 2013
Why not have two textboxes, one for text, and one for code?
  1 Comment
Cedric
Cedric on 6 Aug 2013
Edited: Cedric on 6 Aug 2013
We often mix code with text actually.
If you look at this or this, I wouldn't have been able to manage these answers+comments with split text/code.



Evan
Evan on 6 Aug 2013
Edited: Evan on 6 Aug 2013
Is there any way to have two levels of permissions for editing another user's question? At the moment, assuming there are no users who have been granted privileges prematurely, there are only 15 users capable of editing a question. I would say 50% or less of these users have been very active on these forums over the past month or so.
I understand that editing another user's question is a privilege that has potential for abuse and should therefore be difficult to obtain, but if it were possible to split the permissions in some manner, allowing users with, say, a reputation of 750 or 1000 to use the "format code" feature without modifying the text of a question, would it be worth the effort?
Perhaps it's cynical, but I think that it's going to be near impossible to get new posters to adhere to the standards for formatting. We can put up announcements, add brightly colored textboxes to the "new question" page, and even flag certain keywords, but unless we actually make it impossible to submit a post without formatting those flagged keywords, people will continue submitting giant walls of unformatted code.
And not to hijack this topic, but another feature I would like to see is the ability to move comments and answers for those cases where users don't catch on to the differences between them.
  2 Comments
Cedric
Cedric on 6 Aug 2013
Edited: Cedric on 6 Aug 2013
This is related to the "janitor" type of work mentioned here in my answer at the bottom (copied below).
" " "
I've seen Walter mentioning "janitor" type of work on the forum, and I think that a 500 rep. should allow people to do this kind of work actually, if they have time and energy for this (and if they are trusted; I'll develop this below). It is obviously tricky to give enough privileges to perform janitor work without giving all privileges, but it is certainly worth working on finding a solution.. in the sense that currently you have to be a high rep. member to spend your time on e.g. formatting questions instead of answering them (..).
Jan posted lately a question about formatting and I commented mentioning "trustees". I think that it is meaningful in the sense that active people in the top 10 rep. know roughly who is answering questions and have an idea about the quality of the answers; in other words, I think that privileges would be better distributed by a mechanism involving rep. points but more importantly a sponsorship/trustee mechanism involving these top 10 rep. active people.
Mixing this idea and the "janitor" type of work mentioned above, I believe that it would be quite interesting if members hitting 500 rep. points, and defined as trustees by top 10 members, would get a limited privilege for editing questions (maybe more interesting than giving a privilege for accepting answers). To illustrate, a logic could be:
  • Rep. points provide recognition as they should, but no privilege. These are separate aspects of the "life" on the forum.
  • Rep. points + the sponsorship/trustee flag provide privileges. E.g. 500 pts + trustee provide "janitor type of work" privileges. People with these privileges are thought to be able to know when/where they are proficient enough to accept answers, and hence have the privilege to accept answers. They can also edit questions without having the full editor privilege, which could be defined as: adding/deleting spaces, underscore, stars, and CR/LF. This would allow performing most of the formatting tasks, without leaving the possibility to change the content (addressing hence Jan's concern in his post mentioned above). It would be relatively easy to implement the check: after removal of these characters in both the original and the modified text, the strings must match.
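The check described in the last bullet can be sketched in a few lines of MATLAB. This is only an illustration under assumptions: `stripFmt` and `isFormattingOnlyEdit` are hypothetical names, not part of any forum software, and the example post is invented.

```matlab
% Hypothetical helpers; the names are not part of any forum software.
% Remove the characters a "janitor" may add or delete:
% space, underscore, star, CR and LF.
stripFmt = @(s) regexprep(s, '[ _*\r\n]', '') ;
isFormattingOnlyEdit = @(original, edited) ...
  strcmp(stripFmt(original), stripFmt(edited)) ;

% Invented example post: unformatted code as submitted.
original = sprintf('for k=1:3\ndisp(k)\nend') ;
% Janitor edit: blank lines around the code, two leading spaces per line.
formatted = sprintf('\n  for k=1:3\n  disp(k)\n  end\n') ;
% An edit that also changes the content (1:3 -> 1:4).
tampered = strrep(formatted, '1:3', '1:4') ;

okFormatted = isFormattingOnlyEdit(original, formatted) ;  % true
okTampered  = isFormattingOnlyEdit(original, tampered) ;   % false
```

One limitation of the rule as stated: because stars and underscores are ignored, an edit that turns `a*b` into `ab` or `my_var` into `myvar` would also pass, so the set of allowed characters would need some care for posts containing code.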
" " "
Jan
Jan on 6 Aug 2013
Edited: Jan on 6 Aug 2013
In other forums I see BBcode for formatting, e.g. [code]fprintf[/code]. If this were recognized here as well, the >500 rep janitors could be allowed to insert exactly these keys and nothing else. Then marking the text with the mouse and hitting the "{} Code" button would work without the need to open the message for editing.
The number of conflicts e.g. in "function [code] = createCode" is surprisingly small. But of course there is approximately the same number of beginners who omit the formatting also in the forums I'm talking of.
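To make the conflict concrete, here is a small sketch of how such tags could be picked up with a regular expression. The pattern and both messages are assumptions made up for illustration, not how any forum actually parses BBcode.

```matlab
% Assumed pattern: lazily capture everything between [code] and [/code].
pattern = '\[code\](.*?)\[/code\]' ;

% Invented messages. In msgA the tagged code comes first,
% in msgB the output variable named "code" comes first.
msgA = 'Use [code]fprintf[/code] here. My header is: function [code] = createCode' ;
msgB = 'My header is: function [code] = createCode. Use [code]fprintf[/code] here.' ;

spansA = regexp(msgA, pattern, 'tokens') ;
spansB = regexp(msgB, pattern, 'tokens') ;

fprintf('msgA: "%s"\n', spansA{1}{1}) ;   % the intended span
fprintf('msgB: "%s"\n', spansB{1}{1}) ;   % span starts at the output variable
```

In `msgA` the lone `[code]` in the function header is harmless because no closing tag follows it. In `msgB` the match starts at the header instead, which is the kind of conflict meant above.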



Jan
Jan on 8 Aug 2013
Another simple idea: Some additional buttons are added for the editors to insert standard comments: "Please follow the [? Help] link to learn how to..." and "Please consider the suggestions on [how to ask a good question]...".
In addition, an email notification would need to be sent when a comment is posted as well. Perhaps automatic closing would be useful too.
Then the editors still have to struggle with the forgotten formatting, but hitting one button is much less time consuming.

Jan
Jan on 4 May 2015
And the next bump.
The number of questions with unreadable code is not decreasing. The frequent contributors still waste too much time asking for the application of the "{} Code" format.
Please, TMW, force the newcomers in the forum to read the instructions about code formatting! Remove the captcha (if there is one), but display 3 lines of code and accept the user only if he or she marks them with the mouse and presses the magic button.
