Ask Reuben

Code Quality

Is there a tool to scan 4gl code for vulnerabilities / security issues?

Is there a SonarQube plugin for Genero?

What is gslint?

In recent years there has been an increase in questions around code quality, code vulnerabilities, and code security.

Many years ago before I joined Four Js, I implemented some code quality tests in our development environment.  Every night a number of scripts would run that would check out the latest version of the source, and then run tests to make sure that all checked in code adhered to our coding standards.   In the morning when I arrived at my desk, there would be an automated email waiting for me with a list of any violations of our coding standards.  I would then visit the developer who worked on the task, and also the person who manually code reviewed the task, and get the code modified so that it conformed to our coding standards.

Predominantly these scripts utilised grep to find patterns of code.  As a simplified example, if we wanted all usage of ERROR to pass through a library call, the scripts might have a line like grep -ic " ERROR " *.4gl | grep -v ":0" | grep -v "lib_dialogs.4gl".  This would first count, ignoring case, the number of lines containing “ ERROR ” in each .4gl, then exclude any files where the count was zero, and then exclude our library that did have the function that contained ERROR, leaving us with the .4gl files that contained ERROR but shouldn’t have.
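As a rough illustration of the same idea outside of grep, here is a minimal Python sketch. The function names are hypothetical and it assumes the same pattern and the same exempt library file as the grep example above. It is not what our scripts actually looked like.

```python
import re
from pathlib import Path

# Assumptions mirroring the grep example: the pattern " ERROR " (any case)
# and a single exempt library file that is allowed to contain it.
PATTERN = re.compile(r" ERROR ", re.IGNORECASE)
EXEMPT = {"lib_dialogs.4gl"}


def count_matching_lines(text):
    """Count the lines that contain the pattern, like grep -ic."""
    return sum(1 for line in text.splitlines() if PATTERN.search(line))


def find_violations(directory):
    """Return {filename: count} for .4gl files using ERROR outside the library."""
    violations = {}
    for path in sorted(Path(directory).glob("*.4gl")):
        if path.name in EXEMPT:
            continue  # the library is allowed to contain ERROR
        count = count_matching_lines(path.read_text(errors="replace"))
        if count > 0:  # same effect as grep -v ":0"
            violations[path.name] = count
    return violations


if __name__ == "__main__":
    for name, count in find_violations(".").items():
        print(f"{name}:{count}")
```

Like the grep version, this is purely textual, so it will also match ERROR inside comments and strings.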

For the most part these worked well.  As we had control over it we could continually refine it to catch cases we missed, or rule out false positives.  We even had code standards just to help this test process, for example each entry in the ATTRIBUTES section of a form had to be on one line; it could not wrap onto a second line, just so that simple greps could be used.  Similarly, adherence to case rules and only using one side of an alias, i.e. INT vs INTEGER, simplified the tests.

The world has moved on from those days, and with Buy Before You Build, developers are looking for third party tools to measure the quality of their code and to identify vulnerabilities.  A Swiss company, SonarSource, has done very well in this area with their products including SonarQube, an open core product for static code analysis, and SonarLint, an IDE extension for static analysis.

SonarQube has plugins for a number of programming languages (at the time of writing 29) and we get asked about SonarQube by those companies who also use one of these other languages in their development environments.  SonarQube also has the facility to create plugins for other languages.  If there was commercial interest in a SonarQube plugin being developed for Genero, then that would be considered and you can put that through your account / support contact.

The other name there, SonarLint, may sound familiar, and that is because in Genero 3.20 we introduced a product to do some code quality checks.  That uses a new command called gslint and has various configuration files and environment variables that all use lint in the name.  This is documented in the Studio configuration under Code Quality.  You may also have noticed that the gslint command was being executed to do code quality checks as part of the build; you can see this in the Build Rules …

I’ll let you refer to the documentation on how to configure the Studio Linter tools to perform certain tests.  What I want to do is take you behind the scenes as to how it works, as I believe that to get maximum benefit from this tool you are going to want to add extra tests over and above what we initially supply.  Do the following …

  1. In the Genero Studio install, find the tools/gslint directory.
  2. In there is a command, gslint.sh (and gslint.bat on Windows) which is the command being run by Genero Studio as part of the build rules.  You will see that it is a Genero application.
  3. The src folder contains the source code.
  4. Find gslintanalyse.4gl
  5. Find the function extractASTFromFile
  6. Note that this function executes fglcomp -Tx commandLineOptions Path.  This is the important line.

The AST in the function name refers to Abstract Syntax Tree.  fglcomp -Tx outputs an Abstract Syntax Tree.  An Abstract Syntax Tree can be used in the analysis of code.  For example, if we look at the SonarQube plugin for Python we see it produces and uses an Abstract Syntax Tree.

The Python analyzer parses the source code, creates an Abstract Syntax Tree (AST) and then walks through the entire tree. A coding rule is a visitor that is able to visit nodes from this AST.

As soon as the coding rule visits a node, it can navigate its children and log issues if necessary.

To see an Abstract Syntax Tree for Genero, write a small hello world program

MAIN
   DISPLAY "Hello World"
END MAIN

and run fglcomp -Tx helloworld.4gl. You will get the following output …

<symbol location="helloworld.4gl:1.1-3.8" name="module">
  <symbol location="helloworld.4gl:1.2-1.1" name="compilerOptions"/>
  <symbol location="helloworld.4gl:1.1-1.0" name="optional"/>
  <symbol location="helloworld.4gl:1.2-1.1" name="importL"/>
  <symbol location="helloworld.4gl:1.6-1.5" name="varDefStmtL"/>
  <symbol location="helloworld.4gl:1.1-3.8" name="functionDeclL">
    <symbol location="helloworld.4gl:1.1-3.8" name="main">
      <symbol location="helloworld.4gl:1.6-1.5" name="defineL"/>
      <symbol location="helloworld.4gl:2.5-2.25" name="fglStmtL">
        <symbol location="helloworld.4gl:2.5-2.25" name="display">
          <symbol location="helloworld.4gl:2.13-2.25" name="exprCl">
            <symbol location="helloworld.4gl:2.13-2.25" name="exprCharConst">
              <token location="helloworld.4gl:2.13-2.25" name="CharConst" value="Hello World"/>
            </symbol>
          </symbol>
        </symbol>
      </symbol>
      <symbol location="helloworld.4gl:3.1-3.8" name="endFunction"/>
    </symbol>
  </symbol>
</symbol>

… you will see an XML document.  What gslint does is produce this XML document in memory, and traverse it, applying tests to the various nodes and attributes to see that they meet the rules.
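To make the idea of traversing the tree concrete, here is a minimal Python sketch using a trimmed copy of the hello world output above. The function names walk and display_literals are hypothetical, and the "rule" is a toy one invented for illustration; it is not one of the rules shipped with gslint.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the fglcomp -Tx output shown above for helloworld.4gl.
AST_XML = """
<symbol location="helloworld.4gl:1.1-3.8" name="module">
  <symbol location="helloworld.4gl:1.1-3.8" name="functionDeclL">
    <symbol location="helloworld.4gl:1.1-3.8" name="main">
      <symbol location="helloworld.4gl:2.5-2.25" name="fglStmtL">
        <symbol location="helloworld.4gl:2.5-2.25" name="display">
          <symbol location="helloworld.4gl:2.13-2.25" name="exprCl">
            <symbol location="helloworld.4gl:2.13-2.25" name="exprCharConst">
              <token location="helloworld.4gl:2.13-2.25" name="CharConst" value="Hello World"/>
            </symbol>
          </symbol>
        </symbol>
      </symbol>
    </symbol>
  </symbol>
</symbol>
"""


def walk(node, depth=0):
    """Yield (depth, tag, name) for every node, parents before children."""
    yield depth, node.tag, node.get("name")
    for child in node:
        yield from walk(child, depth + 1)


def display_literals(root):
    """Toy rule: collect string constants that appear inside a DISPLAY."""
    found = []
    for symbol in root.iter("symbol"):
        if symbol.get("name") != "display":
            continue
        for token in symbol.iter("token"):
            if token.get("name") == "CharConst":
                found.append((token.get("location"), token.get("value")))
    return found


if __name__ == "__main__":
    root = ET.fromstring(AST_XML)
    for depth, tag, name in walk(root):
        print("  " * depth + f"{tag} {name}")
    print(display_literals(root))
```

The location attribute is what lets a rule report the file, line and column of an issue back to the developer.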

Let’s look at an example: a rule that DECIMAL variables must be declared with both precision and scale.  That is, with DECIMAL variables we want the developer to be explicit and state DECIMAL(12,0) rather than DECIMAL(12), and to state DECIMAL(16,0) rather than DECIMAL, etc.  If we construct a little test …

MAIN
    DEFINE x DECIMAL
    DEFINE y DECIMAL(12)
    DEFINE z DECIMAL(12, 2)
END MAIN

… and run fglcomp -Tx on this file, the key bits of the output are …

...
              <symbol location="decimal.4gl:2.12-2.12" name="identCl">
                <token location="decimal.4gl:2.12-2.12" name="Ident" value="x"/>
              </symbol>
              <symbol location="decimal.4gl:2.14-2.20" name="typeDecimal">
                
           ...
              <symbol location="decimal.4gl:3.12-3.12" name="identCl">
                <token location="decimal.4gl:3.12-3.12" name="Ident" value="y"/>
              </symbol>
              <symbol location="decimal.4gl:3.14-3.24" name="typeDecimal">
                <symbol location="decimal.4gl:3.21-3.24" name="precScale">
                  <token location="decimal.4gl:3.22-3.23" name="IntegerConst" value="12"/> 
            ...
              <symbol location="decimal.4gl:4.12-4.12" name="identCl">
                <token location="decimal.4gl:4.12-4.12" name="Ident" value="z"/>
              </symbol>
              <symbol location="decimal.4gl:4.14-4.27" name="typeDecimal">
                <symbol location="decimal.4gl:4.21-4.27" name="precScale">
                  <token location="decimal.4gl:4.22-4.23" name="IntegerConst" value="12"/>
                  <token location="decimal.4gl:4.26-4.26" name="IntegerConst" value="2"/>
                ...

… the key bit of the function checkDecimalPrecision() in gslintanalyse.4gl is …

LET precisionScaleNodes = node.selectByPath("//symbol[@name='precScale']//token[@name='IntegerConst']")
IF precisionScaleNodes.getLength() < 2 THEN

… that is, the Abstract Syntax Tree (AST) is being examined to make sure that two precScale/IntegerConst nodes are found.  If there are 0 or 1, then the DECIMAL variable has not been defined with the required precision.
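The same check can be sketched in Python to show how little is involved. This is not the gslint implementation. The sample XML is a hand-closed version of the fragments above, and the wrapping "sample" and "varDef" element names are assumptions added only to make it well-formed; the real fglcomp -Tx output has more structure around each definition.

```python
import xml.etree.ElementTree as ET

# Hand-closed version of the decimal.4gl fragments shown above.
# The "sample" and "varDef" wrappers are hypothetical.
SAMPLE = """
<symbol name="sample">
  <symbol name="varDef">
    <symbol location="decimal.4gl:2.12-2.12" name="identCl">
      <token location="decimal.4gl:2.12-2.12" name="Ident" value="x"/>
    </symbol>
    <symbol location="decimal.4gl:2.14-2.20" name="typeDecimal"/>
  </symbol>
  <symbol name="varDef">
    <symbol location="decimal.4gl:3.12-3.12" name="identCl">
      <token location="decimal.4gl:3.12-3.12" name="Ident" value="y"/>
    </symbol>
    <symbol location="decimal.4gl:3.14-3.24" name="typeDecimal">
      <symbol location="decimal.4gl:3.21-3.24" name="precScale">
        <token location="decimal.4gl:3.22-3.23" name="IntegerConst" value="12"/>
      </symbol>
    </symbol>
  </symbol>
  <symbol name="varDef">
    <symbol location="decimal.4gl:4.12-4.12" name="identCl">
      <token location="decimal.4gl:4.12-4.12" name="Ident" value="z"/>
    </symbol>
    <symbol location="decimal.4gl:4.14-4.27" name="typeDecimal">
      <symbol location="decimal.4gl:4.21-4.27" name="precScale">
        <token location="decimal.4gl:4.22-4.23" name="IntegerConst" value="12"/>
        <token location="decimal.4gl:4.26-4.26" name="IntegerConst" value="2"/>
      </symbol>
    </symbol>
  </symbol>
</symbol>
"""

# Same path as the selectByPath call above, made relative to each type node.
PREC_SCALE_PATH = ".//symbol[@name='precScale']//token[@name='IntegerConst']"


def decimals_missing_scale(root):
    """Return the location of every DECIMAL type with fewer than two integers."""
    problems = []
    for node in root.iter("symbol"):
        if node.get("name") != "typeDecimal":
            continue
        if len(node.findall(PREC_SCALE_PATH)) < 2:
            problems.append(node.get("location"))
    return problems


if __name__ == "__main__":
    for location in decimals_missing_scale(ET.fromstring(SAMPLE)):
        print(f"{location}: DECIMAL declared without precision and scale")
```

With this sample, the definitions of x and y are reported and z is not.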

So the various rules in gslint are doing something similar, and if we were to construct a SonarQube plugin, that would no doubt involve doing something similar.

If you were looking for something similar for forms, a .42f and a .4fd are already XML documents.  You can do similar analysis using XML methods on these files as is; you do not necessarily need to create an AST.
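As a sketch of what a form rule might look like, the Python below checks that every field has a comment. The rule itself is a hypothetical coding standard, and the XML is a hand-written, heavily simplified stand-in for a compiled form, so check the element and attribute names against one of your own .42f files before relying on them.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified stand-in for a .42f file. The field names and
# attribute values are made up for this illustration.
FORM_XML = """
<Form name="customer">
  <Grid>
    <FormField name="formonly.cust_name" colName="cust_name">
      <Edit width="20" comment="Customer name"/>
    </FormField>
    <FormField name="formonly.cust_code" colName="cust_code">
      <Edit width="10"/>
    </FormField>
  </Grid>
</Form>
"""


def fields_without_comment(root):
    """Return the names of form fields whose widget has no comment attribute."""
    missing = []
    for field in root.iter("FormField"):
        widget = next(iter(field), None)  # first child element is the widget
        if widget is None or not widget.get("comment"):
            missing.append(field.get("name"))
    return missing


if __name__ == "__main__":
    print(fields_without_comment(ET.fromstring(FORM_XML)))
```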

The final point I wish to make is that when it comes to detecting vulnerabilities, these tools are not silver bullets.  Starting with …

DEFINE x STRING
INPUT x ...

… this can be passed untouched into any of the following lines and be considered unsafe …

RUN x
EXECUTE IMMEDIATE x
CALL ui.Interface.frontCall("standard","execute",[x],[result])
DISPLAY x TO fieldname
DECLARE c CURSOR FROM x

… no tool will automatically find such combinations.  It is still important to have good coding practices and manual code reviews.