[Owasp-leaders] [Owasp-board] Working toward a resolution on the Contrast Security / OWASP Benchmark fiasco
Kevin W. Wall
kevin.w.wall at gmail.com
Mon Nov 30 18:09:19 UTC 2015
Which is why I didn't suggest anything as radical as Eoin's suggestion.
Some may argue that simply marking the wiki page as "disputed" doesn't go
far enough and they may be right, but IMO it's a start and a step
in the right direction. Let's not let the perfect become the enemy of the good.
Sent from my Droid; please excuse typos.
On Nov 30, 2015 1:01 PM, "Bill Sempf" <bill at pointweb.net> wrote:
> This is not a simple conversation; that's why there is a lack of movement.
> OWASP can't remove all vendor influence. It's just not possible. I give
> free OWASP training all the time. Does it lead to paid training? Of course
> it does. Does the fact that I lead the .NET project contribute to the
> overall impression that POINT has expertise in .NET security? Of course it does.
> Beyond that, OWASP can't survive without money. Would Simon still be
> doing the work he does on ZAP if Mozilla wasn't paying him? Of course not.
> We simply need the support of vendors, both as an institution, and
> So there must be a space in which the vendors can move within OWASP. Are
> there not 'rules of use' revolving around projects and vendor neutrality,
> as there are with the chapters and speakers? If there are not, there should
> be. It's out of my realm of expertise, but I am certain there is some
> guideline out there. If there are rules, and if Benchmark is found to be in
> violation, then there must be a penalty. That's all there is to it.
> On Mon, Nov 30, 2015 at 12:33 PM, Eoin Keary <eoin.keary at owasp.org> wrote:
>> Let's not overcomplicate things....
>> How about co-lead with a non-vendor (industry)?
>> "Vendor" is the opposite of "industry".
>> Vendor supplies security products or services. Industry is a consumer of
>> such things.
>> Then again, let's do nothing, that seems to work :)
>> Eoin Keary
>> OWASP Volunteer
>> On 30 Nov 2015, at 5:15 p.m., Josh Sokol <josh.sokol at owasp.org> wrote:
>> I tend to agree with this. It seems that having a vendor lead a project
>> leads to questions about their ability to remain objective. That said, how
>> do we qualify who is and is not a "vendor"? Anyone who sells something?
>> Do you have to sell a security product? What about a security service?
>> Does it matter if my project has nothing to do with what I "sell"? I would
>> bet that many of our project leaders are currently working for vendors in
>> the security space and should be removed. What happens if we remove the
>> vendor and nobody wants to step up and take over the project? Just some
>> questions in my mind that I've been pondering around this.
>> On Mon, Nov 30, 2015 at 5:17 AM, Eoin Keary <eoin.keary at owasp.org> wrote:
>>> I don't believe vendors should lead any project.
>>> Contribute? yes, Lead? No.
>>> This goes for all projects and shall help with independence and
>>> Eoin Keary
>>> OWASP Volunteer
>>> On 28 Nov 2015, at 11:39 p.m., Kevin W. Wall <kevin.w.wall at gmail.com> wrote:
>>> Until very recently, I've been following at a distance this dispute between
>>> various OWASP members and Contrast Security over the latter's advertising
>>> references to the OWASP Benchmark Project.
>>> While I too believe that mistakes were made, I believe that we all need to
>>> take a step back and not throw out the baby with the bath water.
>>> While unlike Johanna, I have not executed the OWASP Benchmark Project for
>>> any given SAST or DAST tool, having used many such commercial tools, I am
>>> qualified to render a reasoned opinion of the OWASP Benchmark Project, and
>>> perhaps some steps that we can take towards amicable resolution.
>>> Let me start with the OWASP Benchmark Project. I find the idea of having an
>>> extensive baseline of tests against which we can gauge the effectiveness of
>>> SAST and DAST software quite sound. In a way, these tests are analogous to unit
>>> tests that we, as developers, use to find bugs in our code and help us
>>> improve it, where here the discovered false positives and false negatives
>>> are used as the PASS / FAIL criteria for the tests. Just
>>> as in unit testing, where the ideal is to have extensive tests to broaden one's
>>> "test coverage" of the software under test, the Benchmark Project strives
>>> to have a broad set of tests to assist in revealing deficiencies (with
>>> the goal of removing these "defects") in various SAST and DAST tools.
>>> This is all well and good, and I whole-heartedly applaud this effort.
>>> However, I see several ways that this Benchmark Project fails. For one,
>>> we have no way to measure the "test coverage" of the vulnerabilities that
>>> the Benchmark Project claims to measure. There are (by figures that I've
>>> seen claimed) something like 21,000 different test cases. How do we, as
>>> people, know if these 21k 'tests' provide "even" test coverage? For
>>> instance, it is not unreasonable to think that there may be heavy coverage
>>> on tests that are easy to create (e.g., SQLi, buffer overflows, XSS) and
>>> a much lesser emphasis on "test cases" for things like cryptographic
>>> weaknesses. (This would not be surprising in the least, since
>>> every SAST and DAST tool that I've ever used seems to excel in some
>>> areas and absolutely suck in others.)
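One rough way to check whether a test suite provides "even" coverage is to tally test cases per vulnerability category and look at each category's share of the total. The sketch below assumes each test case carries a single category label; the category names and counts are hypothetical placeholders, not actual Benchmark figures.

```python
from collections import Counter

# Hypothetical labels, one per test case. These are NOT real Benchmark data.
test_case_categories = ["sqli"] * 6 + ["xss"] * 3 + ["weak_crypto"] * 1


def coverage_share(categories):
    """Return each category's fraction of the total number of test cases."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


shares = coverage_share(test_case_categories)

# List categories from most to least heavily covered.
for category, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{category}: {share:.0%} of test cases")
```

A heavily skewed distribution would suggest that an aggregate score mostly reflects the categories that are easiest to write tests for.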
>>> Another way that the Benchmark Project is lacking is one that is admitted
>>> on the Benchmark Project wiki page under the "Benchmark Validity" section:
>>> The Benchmark tests are not exactly like real applications. The
>>> tests are derived from coding patterns observed in real
>>> applications, but the majority of them are considerably *simpler*
>>> than real applications. That is, most real world applications will
>>> be considerably harder to successfully analyse than the OWASP
>>> Benchmark Test Suite. Although the tests are based on real code,
>>> it is possible that some tests may have coding patterns that don't
>>> occur frequently in real code.
>>> A lot of tools are great at detecting data and control flows that are simple,
>>> but fail completely when facing "real code" that uses complex MVC frameworks
>>> like Spring Framework or Apache Struts. The bottom line is that we need more
>>> realistic tests. While we can be fairly certain that if a SAST or DAST tool
>>> misses the low bar of one of the existing Benchmark Project test cases,
>>> they are able to _pass_ those tests, it still says *absolutely nothing* about
>>> their ability to detect vulnerabilities in real world code where the code
>>> is often orders of magnitude more complex. (And I would argue that this is
>>> one reason we see the false positive rate so high for SAST and DAST tools:
>>> rather than err on the side of false negatives, they flag "issues" that
>>> they are generally unreliable and then rely on appsec analysts to discern which
>>> are real and which are red herrings. This is still easier than if the
>>> appsec engineers had to hunt down these potential issues manually and
>>> analyze them, so it is not entirely inappropriate. As long as the tool
>>> provides some sort of "confidence" indicator for the various issues that it
>>> finds, an analyst can easily decide whether they are worth spending effort on
>>> further investigation.)
>>> This brings me to what I see as the third major area of where the Benchmark
>>> Project is lacking. In striving to be simple, it attempts to distill all
>>> findings into a single metric. The nicest thing I can think of saying about
>>> this is that it is woefully naive and misguided. I think where it is wrong
>>> is that it assumes that every IT organization in every company weights
>>> everything equally. For instance, it treats false positives and false negatives
>>> as both _equally_ bad. However, in reality, most organizations that I've been with
>>> in AppSec would highly prefer false positives over false negatives.
>>> Likewise, all categories (e.g., buffer overflows, heap corruption, SQLi, XSS, CSRF,
>>> etc.) are weighted equally. Every appsec engineer knows that this is
>>> generally unrealistic; indeed it is _one_ reason that we have different
>>> ratings for different findings. Also, if a company writes all of their
>>> applications in "safe" programming languages like C# or Java, then categories
>>> like buffer overflows or heap corruption completely disappear. What that means
>>> is that those companies don't care at all whether or not a given SAST or DAST
>>> tool can find those categories of vulnerabilities because they are
>>> completely irrelevant for them. However, because there is no way to adjust
>>> the weighting of Benchmark Project findings when run for a given tool,
>>> everything is shoe-horned into a single magical figure.
>>> The result is that that magical Benchmark Project figure becomes almost
>>> meaningless. At best, its meaning is very subjective and not at all as
>>> objective as Contrast's advertising is attempting to lead people to believe.
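To make the weighting argument concrete, here is a minimal sketch of a score that an organization could tune to its own priorities. It is not the Benchmark Project's actual scoring method. The `CategoryResult` type, the `fn_cost` / `fp_cost` parameters, the per-category weights and all of the numbers are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class CategoryResult:
    """Hypothetical per-category tallies for one tool run."""
    tp: int  # real vulnerabilities that were flagged
    fp: int  # safe test cases that were flagged anyway
    tn: int  # safe test cases correctly left alone
    fn: int  # real vulnerabilities that were missed


def category_score(r, fn_cost=1.0, fp_cost=1.0):
    """Return 1.0 for no errors, 0.0 if every test case was misclassified."""
    worst = fn_cost * (r.tp + r.fn) + fp_cost * (r.tn + r.fp)
    if worst == 0:
        return 1.0
    return 1.0 - (fn_cost * r.fn + fp_cost * r.fp) / worst


def weighted_score(results, weights, fn_cost=1.0, fp_cost=1.0):
    """Average category scores using organization-specific category weights."""
    total_weight = sum(weights.get(name, 0.0) for name in results)
    if total_weight == 0:
        raise ValueError("at least one category needs a non-zero weight")
    weighted_sum = sum(
        weights.get(name, 0.0) * category_score(r, fn_cost, fp_cost)
        for name, r in results.items()
    )
    return weighted_sum / total_weight


# Made-up results for a single imaginary tool.
results = {
    "sqli": CategoryResult(tp=8, fp=4, tn=6, fn=2),
    "buffer_overflow": CategoryResult(tp=1, fp=0, tn=10, fn=9),
}

# Everything weighted equally, as a single-figure score implicitly does.
equal = weighted_score(results, {"sqli": 1.0, "buffer_overflow": 1.0})

# A Java-only shop: buffer overflows are irrelevant, misses cost 3x false alarms.
java_shop = weighted_score(
    results, {"sqli": 1.0, "buffer_overflow": 0.0}, fn_cost=3.0, fp_cost=1.0
)

print(f"equal weighting: {equal:.3f}, Java-only shop: {java_shop:.3f}")
```

The same raw results yield different figures depending on the weights chosen, which is why a single unweighted number cannot serve as an objective ranking for every organization.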
>>> I believe that the general reaction to all of this has been negative, at
>>> least based on the comments that I've read not only in the OWASP mailing
>>> lists, but also on Twitter. In the end, this will be damaging to either
>>> OWASP's overall reputation or at the very least, the reputation of the
>>> OWASP Benchmark Project, either of which I think most of us would agree is
>>> bad for the appsec community in general.
>>> Therefore, I have a simple proposal towards resolution. I would appeal to
>>> the OWASP project leaders to appeal to the OWASP Board to simply mark the
>>> OWASP Benchmark Project Wiki page (and ideally, its GitHub site) as indicating
>>> that the findings are being disputed. For the wiki page, we could do this
>>> in the manner that Wikipedia marks disputes, using a Template:Disputed tag
>>> (see https://en.wikipedia.org/wiki/Template:Disputed_tag) or their
>>> "Accuracy Disputes" (for example, see
>>> and https://en.wikipedia.org/wiki/Category:Accuracy_disputes)
>>> At a minimum, we should have this tag result in rendering something like:
>>> "The use and accuracy of this page is currently being disputed.
>>> OWASP does not support any vendor endorsing any of their
>>> software according to the scores resulting from execution of
>>> the OWASP Benchmark."
>>> that the OWASP Board should apply (so that no one is permitted to
>>> remove it without proper authorization).
>>> I will leave the exact wording up to the board. But just like disputed
>>> pages on Wikipedia, OWASP must take action on this or I think they are
>>> likely to have credibility issues in the future.
>>> Thank you for listening,
>>> -kevin wall
>>> Blog: http://off-the-wall-security.blogspot.com/
>>> NSA: All your crypto bit are belong to us.
>>> Owasp-board mailing list
>>> Owasp-board at lists.owasp.org
>> OWASP-Leaders mailing list
>> OWASP-Leaders at lists.owasp.org