libvirt/build-aux/check-spacing.pl

199 lines
6.0 KiB
Perl
Raw Permalink Normal View History

#!/usr/bin/env perl
Document bracket whitespace rules & add syntax-check rule This documents the following whitespace rules if(foo) // Bad if (foo) // Good int foo (int wizz) // Bad int foo(int wizz) // Good bar = foo (wizz); // Bad bar = foo(wizz); // Good typedef int (*foo) (int wizz); // Bad typedef int (*foo)(int wizz); // Good int foo( int wizz ); // Bad int foo(int wizz); // Good There is a syntax-check rule extension to validate all these rules. Checking for 'function (...args...)' is quite difficult since it needs to ignore valid usage with keywords like 'if (...test...)' and while/for/switch. It must also ignore source comments and quoted strings. It is not possible todo this with a simple regex in the normal syntax-check style. So a short Perl script is created instead to analyse the source. In practice this works well enough. The only thing it can't cope with is multi-line quoted strings of the form "start of string\ more lines\ more line\ the end" but this can and should be written as "start of string" "more lines" "more line" "the end" with this simple change, the bracket checking script does not have any false positives across libvirt source, provided it is only run against .c files. It is not practical to run it against .h files, since those use whitespace extensively to get alignment (though this is somewhat inconsistent and could arguably be fixed). The only limitation is that it cannot detect a violation where the first arg starts with a '*', eg foo(*wizz); since this generates too many false positives on function typedefs which can't be supressed efficiently. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-17 10:25:35 +01:00
#
# check-spacing.pl: Report any usage of 'function (..args..)'
# Also check for other syntax issues, such as correct use of ';'
Document bracket whitespace rules & add syntax-check rule This documents the following whitespace rules if(foo) // Bad if (foo) // Good int foo (int wizz) // Bad int foo(int wizz) // Good bar = foo (wizz); // Bad bar = foo(wizz); // Good typedef int (*foo) (int wizz); // Bad typedef int (*foo)(int wizz); // Good int foo( int wizz ); // Bad int foo(int wizz); // Good There is a syntax-check rule extension to validate all these rules. Checking for 'function (...args...)' is quite difficult since it needs to ignore valid usage with keywords like 'if (...test...)' and while/for/switch. It must also ignore source comments and quoted strings. It is not possible todo this with a simple regex in the normal syntax-check style. So a short Perl script is created instead to analyse the source. In practice this works well enough. The only thing it can't cope with is multi-line quoted strings of the form "start of string\ more lines\ more line\ the end" but this can and should be written as "start of string" "more lines" "more line" "the end" with this simple change, the bracket checking script does not have any false positives across libvirt source, provided it is only run against .c files. It is not practical to run it against .h files, since those use whitespace extensively to get alignment (though this is somewhat inconsistent and could arguably be fixed). The only limitation is that it cannot detect a violation where the first arg starts with a '*', eg foo(*wizz); since this generates too many false positives on function typedefs which can't be supressed efficiently. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-17 10:25:35 +01:00
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2.1 of the License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library. If not, see
# <http://www.gnu.org/licenses/>.
use strict;
use warnings;
my $ret = 0;
my $incomment = 0;
foreach my $file (@ARGV) {
# Per-file variables for multiline Curly Bracket (cb_) check
my $cb_linenum = 0;
my $cb_code = "";
my $cb_scolon = 0;
open FILE, $file;
while (defined (my $line = <FILE>)) {
my $data = $line;
# For temporary modifications
my $tmpdata;
# Kill any quoted , ; = or "
$data =~ s/'[";,=]'/'X'/g;
# Kill any quoted strings
$data =~ s,"(?:[^\\\"]|\\.)*","XXX",g;
next if $data =~ /^#/;
# Kill contents of multi-line comments
# and detect end of multi-line comments
if ($incomment) {
if ($data =~ m,\*/,) {
$incomment = 0;
$data =~ s,^.*\*/,*/,;
} else {
$data = "";
}
}
# Kill single line comments, and detect
# start of multi-line comments
if ($data =~ m,/\*.*\*/,) {
$data =~ s,/\*.*\*/,/* */,;
} elsif ($data =~ m,/\*,) {
$incomment = 1;
$data =~ s,/\*.*,/*,;
}
# We need to match things like
#
# int foo (int bar, bool wizz);
# foo (bar, wizz);
#
# but not match things like:
#
# typedef int (*foo)(bar wizz)
#
# we can't do this (efficiently) without
# missing things like
#
# foo (*bar, wizz);
#
# We also don't want to spoil the $data so it can be used
# later on.
$tmpdata = $data;
while ($tmpdata =~ /(\w+)\s\((?!\*)/) {
my $kw = $1;
# Allow space after keywords only
if ($kw =~ /^(?:if|for|while|switch|return)$/) {
$tmpdata =~ s/(?:$kw\s\()/XXX(/;
} else {
print "Whitespace after non-keyword:\n";
print "$file:$.: $line";
$ret = 1;
last;
}
}
# Require whitespace immediately after keywords
if ($data =~ /\b(?:if|for|while|switch|return)\(/) {
print "No whitespace after keyword:\n";
print "$file:$.: $line";
$ret = 1;
}
# Forbid whitespace between )( of a function typedef
if ($data =~ /\(\*\w+\)\s+\(/) {
print "Whitespace between ')' and '(':\n";
print "$file:$.: $line";
$ret = 1;
}
# Forbid whitespace following ( or prior to )
# but allow whitespace before ) on a single line
# (optionally followed by a semicolon)
if (($data =~ /\s\)/ && not $data =~ /^\s+\);?$/) ||
$data =~ /\((?!$)\s/) {
print "Whitespace after '(' or before ')':\n";
print "$file:$.: $line";
$ret = 1;
}
# Forbid whitespace before ";" or ",". Things like below are allowed:
#
# 1) The expression is empty for "for" loop. E.g.
# for (i = 0; ; i++)
#
# 2) An empty statement. E.g.
# while (write(statuswrite, &status, 1) == -1 &&
# errno == EINTR)
# ;
#
if ($data =~ /\s[;,]/) {
unless ($data =~ /\S; ; / ||
$data =~ /^\s+;/) {
print "Whitespace before semicolon or comma:\n";
print "$file:$.: $line";
$ret = 1;
}
}
# Require EOL, macro line continuation, or whitespace after ";".
# Allow "for (;;)" as an exception.
if ($data =~ /;[^ \\\n;)]/) {
print "Invalid character after semicolon:\n";
print "$file:$.: $line";
$ret = 1;
}
# Require EOL, space, or enum/struct end after comma.
if ($data =~ /,[^ \\\n)}]/) {
print "Invalid character after comma:\n";
print "$file:$.: $line";
$ret = 1;
}
Document bracket whitespace rules & add syntax-check rule This documents the following whitespace rules if(foo) // Bad if (foo) // Good int foo (int wizz) // Bad int foo(int wizz) // Good bar = foo (wizz); // Bad bar = foo(wizz); // Good typedef int (*foo) (int wizz); // Bad typedef int (*foo)(int wizz); // Good int foo( int wizz ); // Bad int foo(int wizz); // Good There is a syntax-check rule extension to validate all these rules. Checking for 'function (...args...)' is quite difficult since it needs to ignore valid usage with keywords like 'if (...test...)' and while/for/switch. It must also ignore source comments and quoted strings. It is not possible todo this with a simple regex in the normal syntax-check style. So a short Perl script is created instead to analyse the source. In practice this works well enough. The only thing it can't cope with is multi-line quoted strings of the form "start of string\ more lines\ more line\ the end" but this can and should be written as "start of string" "more lines" "more line" "the end" with this simple change, the bracket checking script does not have any false positives across libvirt source, provided it is only run against .c files. It is not practical to run it against .h files, since those use whitespace extensively to get alignment (though this is somewhat inconsistent and could arguably be fixed). The only limitation is that it cannot detect a violation where the first arg starts with a '*', eg foo(*wizz); since this generates too many false positives on function typedefs which can't be supressed efficiently. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-17 10:25:35 +01:00
# Require spaces around assignment '=', compounds and '=='
if ($data =~ /[^ ]\b[!<>&|\-+*\/%\^=]?=/ ||
$data =~ /=[^= \\\n]/) {
print "Spacing around '=' or '==':\n";
print "$file:$.: $line";
$ret = 1;
}
Document bracket whitespace rules & add syntax-check rule This documents the following whitespace rules if(foo) // Bad if (foo) // Good int foo (int wizz) // Bad int foo(int wizz) // Good bar = foo (wizz); // Bad bar = foo(wizz); // Good typedef int (*foo) (int wizz); // Bad typedef int (*foo)(int wizz); // Good int foo( int wizz ); // Bad int foo(int wizz); // Good There is a syntax-check rule extension to validate all these rules. Checking for 'function (...args...)' is quite difficult since it needs to ignore valid usage with keywords like 'if (...test...)' and while/for/switch. It must also ignore source comments and quoted strings. It is not possible todo this with a simple regex in the normal syntax-check style. So a short Perl script is created instead to analyse the source. In practice this works well enough. The only thing it can't cope with is multi-line quoted strings of the form "start of string\ more lines\ more line\ the end" but this can and should be written as "start of string" "more lines" "more line" "the end" with this simple change, the bracket checking script does not have any false positives across libvirt source, provided it is only run against .c files. It is not practical to run it against .h files, since those use whitespace extensively to get alignment (though this is somewhat inconsistent and could arguably be fixed). The only limitation is that it cannot detect a violation where the first arg starts with a '*', eg foo(*wizz); since this generates too many false positives on function typedefs which can't be supressed efficiently. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-17 10:25:35 +01:00
# One line conditional statements with one line bodies should
# not use curly brackets.
if ($data =~ /^\s*(if|while|for)\b.*\{$/) {
$cb_linenum = $.;
$cb_code = $line;
$cb_scolon = 0;
}
Document bracket whitespace rules & add syntax-check rule This documents the following whitespace rules if(foo) // Bad if (foo) // Good int foo (int wizz) // Bad int foo(int wizz) // Good bar = foo (wizz); // Bad bar = foo(wizz); // Good typedef int (*foo) (int wizz); // Bad typedef int (*foo)(int wizz); // Good int foo( int wizz ); // Bad int foo(int wizz); // Good There is a syntax-check rule extension to validate all these rules. Checking for 'function (...args...)' is quite difficult since it needs to ignore valid usage with keywords like 'if (...test...)' and while/for/switch. It must also ignore source comments and quoted strings. It is not possible todo this with a simple regex in the normal syntax-check style. So a short Perl script is created instead to analyse the source. In practice this works well enough. The only thing it can't cope with is multi-line quoted strings of the form "start of string\ more lines\ more line\ the end" but this can and should be written as "start of string" "more lines" "more line" "the end" with this simple change, the bracket checking script does not have any false positives across libvirt source, provided it is only run against .c files. It is not practical to run it against .h files, since those use whitespace extensively to get alignment (though this is somewhat inconsistent and could arguably be fixed). The only limitation is that it cannot detect a violation where the first arg starts with a '*', eg foo(*wizz); since this generates too many false positives on function typedefs which can't be supressed efficiently. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-17 10:25:35 +01:00
# We need to check for exactly one semicolon inside the body,
# because empty statements (e.g. with comment only) are
# allowed
if ($cb_linenum == $. - 1 && $data =~ /^[^;]*;[^;]*$/) {
$cb_code .= $line;
$cb_scolon = 1;
}
if ($data =~ /^\s*}\s*$/ &&
$cb_linenum == $. - 2 &&
$cb_scolon) {
print "Curly brackets around single-line body:\n";
print "$file:$cb_linenum-$.:\n$cb_code$line";
$ret = 1;
# There _should_ be no need to reset the values; but to
# keep my inner peace...
$cb_linenum = 0;
$cb_scolon = 0;
$cb_code = "";
}
Document bracket whitespace rules & add syntax-check rule This documents the following whitespace rules if(foo) // Bad if (foo) // Good int foo (int wizz) // Bad int foo(int wizz) // Good bar = foo (wizz); // Bad bar = foo(wizz); // Good typedef int (*foo) (int wizz); // Bad typedef int (*foo)(int wizz); // Good int foo( int wizz ); // Bad int foo(int wizz); // Good There is a syntax-check rule extension to validate all these rules. Checking for 'function (...args...)' is quite difficult since it needs to ignore valid usage with keywords like 'if (...test...)' and while/for/switch. It must also ignore source comments and quoted strings. It is not possible todo this with a simple regex in the normal syntax-check style. So a short Perl script is created instead to analyse the source. In practice this works well enough. The only thing it can't cope with is multi-line quoted strings of the form "start of string\ more lines\ more line\ the end" but this can and should be written as "start of string" "more lines" "more line" "the end" with this simple change, the bracket checking script does not have any false positives across libvirt source, provided it is only run against .c files. It is not practical to run it against .h files, since those use whitespace extensively to get alignment (though this is somewhat inconsistent and could arguably be fixed). The only limitation is that it cannot detect a violation where the first arg starts with a '*', eg foo(*wizz); since this generates too many false positives on function typedefs which can't be supressed efficiently. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-10-17 10:25:35 +01:00
}
close FILE;
}
exit $ret;