Enterprise Security Weekly – Rami Essaid, Distil Networks

April 26, 2017

Rami Essaid is the CEO and Co-Founder of Distil Networks. With over 15 years of experience in telecommunications, network security, and cloud infrastructure management, Rami advises enterprise companies around the world, focusing on embracing the cloud to improve scalability and reliability.

Paul: This week we welcome Rami Essaid from Distil Networks to the show. In the news this week, evaluating endpoint protection, trouble at Tanium, micro-virtualization, spying on your users, Mac meets anti-malware, Kazby's and Facebook work together, all that and more on this episode of "Enterprise Security Weekly."

Intro: This is Security Weekly, for security professionals, by security professionals. Broadcasting live from G-Unit Studios in Rhode Island, it's the show where we talk about security vendors and aren't afraid to name names. It's "Enterprise Security Weekly."

Paul: Welcome, everyone, to "Enterprise Security Weekly." This is episode 41 for April 20th, 2017. I'm your host, Paul Asadoorian coming at you live from Rhode Island. On the lines via Skype, from none other than South Dakota, Mr. John Strand. Welcome to the show.

John: And I'm currently communicating with the people at Distil Networks, their online chat people here. So I just went to their website, it popped up, you know, I wanted to know more about Distil, and I figure I will communicate here with Austin and we'll see how it goes. So...

Paul: It's so funny. Yeah, I wanna talk about that. I want to also welcome our guest from Distil Networks who we've talked about on the show before. We couldn't quite get through some of the marketing, and what happened was they heard about it, they reached out, and I got on the phone with Rami and I asked Rami...vendors actually struggle with this question. Rami, you did not struggle with like what problem do you solve, why do you do it better? You'd be surprised how many people like fall over. Rami had awesome answers. I'm like, "Dude, you're coming on the show to talk about the problems that you solve." So I'd like to welcome Rami Essaid from Distil Networks. You are the CEO and co-founder, is that correct?

Rami: That's right.

Paul: Rami, welcome to the show.

Rami: Thanks so much for having me.

Paul: Yes, it's nice to have you here. So your company focuses on the problem of bots, and I think...First, I wanna talk with you and define, like, what is a bot in this context? And we've had this discussion already, but for the benefit of our listeners, I think this term gets thrown around and it can have multiple different meanings, so if you could contextualize it for us that'd be fantastic.

Rami: Yeah, so when we are talking about bots at Distil, we're talking about any kind of web automation. So a script or a program that's accessing your web infrastructure, your APIs, your web services, and pretending to be a human, or automating tasks that humans would end up doing on your website.

Paul: To me, and correct me if I'm wrong, when a data scientist helped define the difference between AI and machine learning, this is very much AI in my mind.

Rami: Well, it could be. I mean you have bots that are driven by, you know, very programmatic things that are, you know, really simple to identify and then there's some that fall into kind of a more sophisticated realm that you have to dive a lot deeper into to be able to understand and identify, you have to have a lot more intelligence behind.

Paul: So what are some of the problems that, especially from a security perspective because this is a security show, what are some of the problems that arise from these bots?

Rami: So bots are a tool. They're, in fact, a hacker's favorite tool now, to automate and amplify their attacks. So whether it's brute forcing through a login page with a stolen list of credentials or running a vulnerability scan, even using as a man in the middle proxy, what it is is an automation of the commands that you're trying to run. And so, any kind of web application vulnerability or any kind of attack against the web services is really, nowadays, being done through some sort of automation. So if you take away the bot, if you take away their ability to use those tools, then you are just a little bit more secure as a website. That's aside, obviously, from the application DDoS and the full denial-of-service, which most people have.

Paul: Correct.

John: You know, Paul, it actually sounds like more of a proactive kind of active defense. It's not like actually developing static signatures for certain types of attacks, but it's...so what type of mitigation steps do you take whenever you do see something that is suspicious?

Rami: Yeah, so what we try to do is draw a circle around what we know is good traffic, legitimate human traffic, and then anything that falls outside of that, either sort of behaviorally or through telemetry that we've seen before, that we intercept, challenge, and try to predict if it's going to be a bot. It's not reacting to when somebody does something bad, it is, just like you said, trying to predict when it's going to be a malicious actor because this falls outside of what legitimate human traffic looks like.

Paul: And one of the use cases that I want to explore, I think, delves into your world of pen testing, John. I'm not sure how much either of you want to disclose on the subject, but I happen to know through intelligence sources that when, let's for example say, I'm going to conduct a phishing exercise, or I'm an evil bad guy and I'm gonna do a malicious phishing campaign against an organization, I wanna learn as much about that organization as possible, especially its users. So I go to popular sites like LinkedIn and Facebook and Twitter and I really automate with a bot the data collection and intelligence gathering for this company. And so how can you help those companies with this problem and prevent that information harvesting from happening? I know Facebook and all of the large companies that have these large sources of data do their best to help prevent that from happening, but it's not perfect.

Rami: No, it’s not. And we work with some of the bigger brand names out there. There's enormous teams that are tasked with doing this, but they still haven't gotten it quite right. And we actually work with some Fortune 50 companies, you know, dozens of Fortune 500 companies and tens of thousands of domains across the web. We do this as a purpose-built solution and what's beautiful about it is we can share the knowledge across each customer so even somebody like LinkedIn who sees a significant amount more traffic than anybody else, we still see more traffic than them in aggregate. And so we have more knowledge about what bots are doing and we have over 100 engineers that are purpose-built for this problem. So large or small companies can utilize, you know, a purpose-built solution like this.

John: So how do you guys handle...sorry, I'm actually communicating with Austin over here, and just for the record, this is actually kind of cool. Like I said, I went to the website and all of a sudden I get this pop-up and I'm talking with Austin. You know how a lot of times like the guy, "Hey, would you like to know more about our company?" It's not really someone in that company, it's like some service off someplace. But I think Austin is a real person...

Rami: He works for us.

John: ...because whenever I was talking about it, he came back and he said, "Hey, they're doing the recording in our office right now." And I'm like, I can see the logo in the back. So that actually shows that you guys have real people answering your web stuff, which I think is pretty cool.

Paul: It's not a bot. That's really funny.

John: Yeah, so at least I don't think Austin is a bot. So anyway...

Paul: It's a really good one if it is.

John: It is a very good bot. They know more about bots than anybody else, let's remember this. So the question I have is how do you differentiate crawling activity that is legitimate? You want it to happen from like Googlebot, MSNBot, and how do you differentiate that from somebody who's going through and scraping your website?

Rami: Yeah, so we have a number of different bots that we whitelist. We look at the header information that they send, the IP space that they come from, and we've pinned those people together to a very particular set of credentials that we know we can whitelist off of. People that fall outside of that, we allow our customers to identify individuals and say, "Hey, we want to whitelist this ourselves, or not," and then everything outside of that, we count as malicious. Right, if I don't know about it as being one of the biggest, you know, players out there and if our customer doesn't know about it and you're doing some sort of crawling activity, then you might as well be bad in our book.

John: So whenever you're looking at those IP addresses, traditionally in your experience, those legitimate bots that you want coming to your website, do those IP addresses shift quite often or do you find that a lot of the big players are pretty much coming from a very, very tight ASN network, BGP, something like that, that actually says, "Yeah, it's coming from Google, but more importantly, it's coming from this specific set of IP ranges in Google?"

Rami: It is important to go beyond an ASN because somebody like Google, for example, it's a great example, Google has IP space in which you can rent Google cloud servers. And so you rent a Google cloud server, it's gonna come from a Google IP, and you can easily set your user agent to be Googlebot. And that's a really common way for people to pretend to be Google and get through most defenses and crawl your site at a really high rate because you never want to think to block Google. So it is a lot more pinned down, they do come from a much narrower set. But we do see some variance. We do see that sometimes they change it. You know, once a year we see companies like Microsoft change the user agent that they send back, and so we have to be on top of it. Some major companies we work with as partners, some major companies we just have to know to expect some adaptation and continue to update our system.
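
The check Rami describes, pinning a claimed crawler identity to a narrow set of addresses rather than a whole ASN, might be sketched roughly as follows. This is only an illustration, not Distil's implementation: `ExampleBot`, the `VERIFIED_CRAWLERS` table, and the address range (a reserved documentation network) are all hypothetical.

```python
import ipaddress

# Hypothetical allowlist: crawler name -> (user-agent token, narrow IP ranges).
# 192.0.2.0/28 is carved from a reserved documentation network, not a real crawler range.
VERIFIED_CRAWLERS = {
    "examplebot": ("ExampleBot", [ipaddress.ip_network("192.0.2.0/28")]),
}


def classify_crawler(user_agent: str, remote_ip: str) -> str:
    """Return 'verified', 'impostor', or 'unknown' for a request."""
    ip = ipaddress.ip_address(remote_ip)
    for token, networks in VERIFIED_CRAWLERS.values():
        if token.lower() in user_agent.lower():
            # The user agent claims to be this crawler, so the source address
            # must fall inside the narrow range pinned to it.
            if any(ip in net for net in networks):
                return "verified"
            return "impostor"
    return "unknown"
```

A request that claims the crawler's user agent from outside the pinned range comes back as `impostor`, which is the rented-cloud-server case from the answer above.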

John: And kind of continuing down that path, whenever you're looking at the communications that come in, how often...and this is something Paul had a presentation about a while ago that was hilarious. What percentage of these, like, crawls and attacks do you see where the bad guys aren't even changing their user agent string? It's coming from like Nikto or Acunetix and you're just like, "Come on, could you at least try a little harder?"

Rami: You know, 10% are really, really dumb. Ten percent aren't even a browser. And then another 5% to 10% actually come in from user agents that are so outdated. I mean we see Firefox 5, like just things that don't even exist anymore. So there is a whole swath of people that are just really, really dumb and not even trying. But after the first 25%, 30%, you really get into people that are using legitimate user agents. And it's fascinating, they update them year-by-year, right. They correlate to the latest browser, the latest version of Safari, the latest technology so that they can really hide in there. And then you have the top 25% that are actually using legitimate browsers and using, you know, embedded malware or automation techniques that are automating real browsers. So those, you know, you can't even tell, if you're looking at the user agent, that they're not a real browser or that they're using automation.
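
As a rough sketch of the first two tiers Rami mentions (clients that are not browsers at all, and user agents that are badly outdated), a first-pass triage could look like the following. The token list and the version cutoff are assumptions picked for illustration; anything that passes still needs the deeper checks discussed later, since the user agent alone says nothing about automated real browsers.

```python
import re

# Hypothetical values, chosen only for illustration.
TOOL_TOKENS = ("nikto", "acunetix", "curl", "python-requests")
MIN_FIREFOX_MAJOR = 50


def triage_user_agent(user_agent: str) -> str:
    """Cheap first-pass bucket for a user-agent string."""
    ua = user_agent.lower()
    # Empty strings and well-known tool names are not browsers at all.
    if not ua or any(token in ua for token in TOOL_TOKENS):
        return "not_a_browser"
    # Browser versions that nobody realistically runs any more.
    match = re.search(r"firefox/(\d+)", ua)
    if match and int(match.group(1)) < MIN_FIREFOX_MAJOR:
        return "outdated_browser"
    # A plausible user agent proves nothing; it only moves on to deeper checks.
    return "needs_deeper_checks"
```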

Paul: So Rami, when I have my web property set up and I make a major code change, is there some re-learning that has to happen inside of your software so that you know that it's not people that are changing their behavior, it's the application that has changed behavior?

Rami: Yeah, absolutely. So we continue to reclassify. We run our classifiers every 30 minutes. It's really important to continue to adjust the learning aspect of what's going on because things could change both on your code, but things could change, you know, in the broader world, right. You could have a sale that all of a sudden drives a swath of traffic we wouldn't have ever seen before. You could do a lot of different things to impact that, and the last thing we wanna do is have any false positives. When you're working with machine learning, there's gonna be some false positives, but we keep ours to less than 1 in 10,000 and we always offer a path out so it's never a hard block. It's more of an "introduce friction slowly" approach, to see if they're able to overcome the challenges that we present and prove their identity as a real person.

Paul: Well, that's interesting. So if you recognize behavior that falls outside of normal operations, you're not just like blacklisting those IPs and blocking them, right? You mentioned a progression, like what is that progression?

Rami: Yeah, so first of all, we're not really IP-centric. What we do is we use a device fingerprint to track a computer. It helps us not have to reinvent the wheel. Otherwise, you're gonna play IP whack-a-mole in trying to prevent bad guys because it's too easy to go through Tor, go through a VPN, and just get a new IP every single request. So the device fingerprint follows you no matter where you are. A little big brother-ish, but it's much more effective and I think a lot of web security needs to shift that way. But when we're talking...

John: So whenever you're fingerprinting a browser, what are the attributes you're fingerprinting on? Are you doing it like on the plugins that are available, the browser resolution? Like, you're doing much more than user agent string. What are the different attributes, and you don't have to give all of them of course, but you are looking into?

Rami: Yeah, so we throw away anything that's passed back by the browser. So MIME types and user agents and languages, all of that, we discard because that's the first place that the bad guys are gonna go spoof. We use more things like canvas fingerprinting, fingerprinting the audio card, looking at WebRTC and probing the local network, those kinds of things, to be able to pull back attributes that create the fingerprint.
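
One way to picture that idea: drop the self-reported fields, then hash whatever probed signals remain into a stable identifier. This is a minimal sketch under stated assumptions; the field names (`canvas_hash`, `audio_hash`, and so on) are hypothetical, and the interview does not say how Distil actually combines its signals.

```python
import hashlib
import json

# Fields the browser simply reports about itself; per the interview these are
# discarded because they are the first thing an attacker will spoof.
SELF_REPORTED = {"user_agent", "mime_types", "languages"}


def device_fingerprint(signals: dict) -> str:
    """Hash the probed (not self-reported) signals into a stable hex identifier."""
    probed = {k: v for k, v in signals.items() if k not in SELF_REPORTED}
    # Sort keys so the same signals always serialize, and hash, identically.
    canonical = json.dumps(probed, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the IP address is not part of the input, a client that hops between VPN exits keeps the same identifier as long as its probed signals stay the same.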

John: Are you doing anything at the TCP/IP stack level? Like basically looking at the window sizes of some of those to try and do a little bit better fingerprinting?

Rami: You can look at window size, but that's actually...Because we're embedded or we're pulling our information, oftentimes, from JavaScript, you can actually very easily spoof that within the browser. So screen resolution is what the browser reports as screen resolution. That's not what the actual machine screen resolution is, at least on desktops, and so we discount that very heavily.

John: Oh, no, I was talking about TCP/IP, the window size protocol header field. It's the amount of data that's loaded into [crosstalk] operating system before it's loaded. Yeah.

Rami: Sorry. Sorry, I misunderstood. So we're often sitting behind either a CDN or behind an infrastructure, like you know, behind your firewall, behind the load balancer and then you have our box, and so all of that information goes away. We're interfacing with the load balancer and so we don't get any access to that. If we were the first line of defense, absolutely. It would be amazing to be able to do that, but unfortunately not.

John: I got all excited for a second but that makes absolute sense.

Paul: Yeah, it does. So Rami, your technology sits inside of...so JavaScript is loaded by whatever browser type thing is accessing the website, is that correct? Is that how you’re hooking the clients and fingerprinting them?

Rami: Well, we're actually sitting as an in-line device. We either integrate into your CDN, integrate into your cloud environment, or on-premise in a physical or virtual box. And we inject the JavaScript ourselves, either an in-line proxy or an in-line bridge. And we'll do the JavaScript injection. We'll insert some hidden CSS, some hidden honey pot links. We will manipulate your HTML. We're gonna do a lot of different things because catching bots is kind of like peeling an onion. You have to put in just layer after layer after layer of things to be able to catch all the different types of bots. There's no one silver bullet that can catch all of them. And we don't want people to key off everything we are doing, and if we threw the kitchen sink into JavaScript, it would just take five seconds to load. So all of those reasons, we like to be a device sitting in-line and then slowly interject different things.
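
The hidden honeypot link is the easiest of those layers to illustrate. The sketch below assumes an in-line component that can rewrite the response body; the `/trap/` path and both function names are hypothetical. A human never sees or follows the link, so any request for it points to automation.

```python
import secrets
from typing import Optional, Set, Tuple


def inject_honeypot(html: str, token: Optional[str] = None) -> Tuple[str, str]:
    """Insert an invisible link just before </body>; return (new_html, token)."""
    token = token or secrets.token_hex(8)
    link = (
        f'<a href="/trap/{token}" style="display:none" '
        'rel="nofollow" tabindex="-1" aria-hidden="true"></a>'
    )
    idx = html.lower().rfind("</body>")
    if idx == -1:
        # No closing body tag found: append at the end instead.
        return html + link, token
    return html[:idx] + link + html[idx:], token


def is_honeypot_hit(path: str, issued_tokens: Set[str]) -> bool:
    """True if the requested path is a trap link that was actually issued."""
    prefix = "/trap/"
    return path.startswith(prefix) and path[len(prefix):] in issued_tokens
```

Randomizing the token per response gives a crawler no fixed URL to learn and skip, which fits the point about not letting people key off what is being done.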

John: So I'm guessing that you guys are pretty paranoid about people attacking your company because if you're injecting JavaScript into the browsers of not just bots, but also legitimate browsers that users are going into, like you gotta admit like any nation-state level attacker would absolutely love to have that type of access. So you know, is that something that keeps you awake at night trying to keep the security of your company solid?

Rami: That is honestly my biggest fear, you know, that keeps me up at night. The one thing that could monumentally change the path that we're on is some sort of breach or hack. We take security, obviously, really carefully and seriously at the company. You know, I yell at any employee who doesn't have, you know, everything locked down. We are really diligent about it and we have people whose job it is to just secure us internally, not even secure our product, but just secure the company internally.

Paul: No, that's really good. So do you find that attackers that are writing bots or any kind of attack tools, have you noticed that they've developed fingerprinting techniques for your specific events in JavaScript that are trying to detect them? So like a cat and mouse game that's going on?

Rami: It's a constant cat and mouse game. They're both trying to elude the things that we're catching them with in JavaScript and trying to manipulate their fingerprint. So we see that happening on a regular basis. We've actually even developed a system to try to catch variations of fingerprint. Right, we know our fingerprint is not infallible because it is, at the end of the day, client-side data that we're pulling out, and you know, if you are in control of the client you can manipulate what data you send back when you know what we're looking for. So, we look at the variances between fingerprints and then look at the reputation of the IP space that you're coming in from and a number of different things. So there's, you know, a constant arms race that our team is working on to try to stay one step ahead of people. That's what makes it, actually, a reason to buy.

John: [crosstalk]...bot people, but how do you guys handle...like a lot of users are now starting to move to using Ghostery or using NoScript and things like that. How does your system handle that when people are basically just shutting off your JavaScript completely from running? I know that that's an edge case, but it's just interesting.

Rami: It's not really that edge...I mean there's a lot of people that run these NoScripts, and so we can re-segment our JavaScript checks into different things. So we know, for example, that the Tor browser shuts off any kind of canvas fingerprint. Right, they know to recognize it and they shut that down. So we are able to segment the different things that we do into little encapsulations and because we're sitting in-line, we embed them into the code of the page. We can embed them into your actual page. Ghostery keys off third party tags. All these ad blockers key off JavaScript that's running from a, you know, GA.dot or running from, you know, a third party like a DoubleClick or DoubleVerify. When it's code running within the page, you don't know what to key off, and especially when it's randomized code that we inject in different places, when there's no set code path to follow. So we've seen that before, it's something that we looked at. It's the reason that we, again, sit in-line and we don't inject as a third party piece of JavaScript.

John: So that's gotta be kind of a tightrope walk, right, if you're actually injecting code directly into your customers' pages as they are going through. Do you ever worry about actually stepping on the page somehow and actually scrambling it and making it so it doesn't render properly?

Rami: Yeah, so we go through a really heavy QA process. Our injections work 100% of the time when the page is coded properly. If the page isn't coded properly, let’s say they have the wrong types, they have the wrong content types, we could mess things up. And so we go through a pretty heavy QA process before we launch a customer to identify those kinds of issues. For example, if you tell us, "Hey, this is a piece of HTML," but it's really a piece of JavaScript, well, you can’t inject JavaScript randomly into JavaScript and expect it to work. I can inject JavaScript anywhere into HTML. As long as it's not inside of a tag, that'll be fine. But if you put JavaScript into JavaScript, it'll break the page. So those are the kinds of things that we have to look out for.
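
The mislabeled-content problem Rami describes suggests a guard before any injection: trust the declared content type only if the body also looks like HTML. The sketch below is one hedged way to do that; `safe_to_inject` and its sniffing heuristic are assumptions, not something described in the interview.

```python
def safe_to_inject(content_type: str, body: str) -> bool:
    """Decide whether a response is really HTML before injecting a script into it."""
    # Strip parameters such as "; charset=utf-8" from the header value.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type not in ("text/html", "application/xhtml+xml"):
        return False
    # The header can lie: a JavaScript file served as text/html would break
    # if more JavaScript were spliced into it, so sniff the start of the body.
    head = body.lstrip()[:512].lower()
    return head.startswith("<!doctype html") or "<html" in head
```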

John: Very cool.

Paul: That is really cool. Since you have code that's running inside of the browser, have you been careful not to trigger alarms that are set to look for any kinds of malicious code? In other words, I've got a browser plugin or some kind of endpoint protection? Like, is it flagging that site as malicious because you're tracking it this way?

John: Yeah, that would be another thing too.

Rami: Yeah, so we've had to talk to a lot of the endpoint security companies. We've gotten inquiries, and so we make sure that there's little nuggets that they can key off. And we've talked to them about, "Hey, here's what we are doing," so that the endpoint security companies can know not to look for us or not to flag us as potential malware. That did come up unexpectedly for us two years ago. That was a surprise, "Whoops, we didn't realize we were doing that," as we started getting bigger and bigger.

Paul: That's really cool.

Rami: Yeah, you forget about how intertwined everything is and you do one thing on one side and you don't realize how that percolates across so many different types of companies.

Paul: So what is the real resounding value for your product, Rami? Like, what actions do people get to take that are really, like, wins for the customer?

Rami: So it's preventing the bot traffic, right, so that you can reduce infrastructural load off of your backend, that's a really simple one. You can cut out 30%, 40% of your traffic as either benign or malicious bot activity. But it's really the peace of mind of knowing that you're more secure. So things like account takeover, credit card fraud, brute force attacks, all of those things are bot-based, and the majority of which happen because of bots. So if you're vulnerable to those kinds of attacks, it's a really big deal. We just prevented millions of dollars from a major retailer in gift card fraud. There was a big ring automating attempts at different gift card sites and just checking to see, you know, does this gift card exist, does it have a value? When we worked with this particular retailer, we were in-line, prevented the bot from seeing the answer to that, where other retailers had to completely shut down their online gift card system, and we saved them millions of dollars of fraud. So that's the kind of good hygiene that is preventative to having to worry about these things retroactively.
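
The gift card attack is an enumeration pattern: one client checking the balance of many different card numbers. A toy detector for that pattern, keyed on a device fingerprint rather than an IP address, might look like this. The class name and threshold are hypothetical, and a real system would also need a time window and expiry, which are left out here.

```python
from collections import defaultdict


class EnumerationDetector:
    """Flag clients that look up too many distinct gift card numbers."""

    def __init__(self, max_distinct: int = 5):
        self.max_distinct = max_distinct
        self._seen = defaultdict(set)  # fingerprint -> card numbers checked

    def record(self, fingerprint: str, card_number: str) -> bool:
        """Record a balance lookup; return True once the client exceeds the limit."""
        self._seen[fingerprint].add(card_number)
        return len(self._seen[fingerprint]) > self.max_distinct
```

A shopper re-checking one card never trips the limit, while a script walking through card numbers does after a handful of lookups.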

Paul: John, you wouldn't know anything about the...

John: Good, good list of testimonials from your customers, and that's something we don't see a lot with security products at all. People are willing to just basically put up and say, "Hey, this product has worked for us," which I think is pretty awesome.

Rami: Honestly, I'm baffled by that, but I love it. I thank my customers so much, but we also listen to them and collaborate with them. But yeah, I mean, dozens and dozens of customers are saying, "Hey, yes, we're willing to talk about how good of a job you've done."

Paul: So on the flip side of that, and we all know the big social network that, somewhat, numbers are inflated due to bots, which I imagine that you probably are not working with them. But there are others out there that are relying on some of their numbers, maybe to go get their next round of funding, and if they implement your solution, it's going to drastically bring down those numbers. Like, have you run into that situation, and like what's your advice for companies that are trying to separate real users from non-legitimate users?

Rami: Well, I mean, if you started with a good baseline, then you'd be in a much healthier place anyway. The problem of, you know, "We've been reporting on this and we can’t filter it out," it all depends on, you know, what that group of users is doing. If you're filtering out the stuff that's just malicious, then who cares if you're reporting on it. You wanna make sure that it's not hurting you. Now, if you're filtering things that are benign, you have the granularity within our system to say, "You know what, keep everything in monitor mode but only block things or intercept things that are attacking the login page, or trying to do account takeovers, or trying to do credit card fraud, or doing a vulnerability scan across our site." So you can pick and choose what you block and what you let through, and so that way, you know, it doesn't have to impact you. We have, actually, publishers that worry about this, not necessarily from getting their next round, but they make money off every single bot viewer. So before they turn any kind of intercept and blocking on, they wanna know how big of a problem do I really have?
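
The "monitor everything, block only the sensitive paths" setup can be pictured as a small policy table. The paths, action names, and `action_for` function below are all hypothetical; the interview only says that this kind of granularity exists.

```python
# Hypothetical policy: path prefix -> action for traffic already judged to be a bot.
POLICY = [
    ("/login", "block"),
    ("/checkout", "block"),
]
DEFAULT_ACTION = "monitor"  # observe and count, but let the request through


def action_for(path: str, is_bot: bool) -> str:
    """Pick what to do with a request, given the bot verdict and its path."""
    if not is_bot:
        return "allow"
    for prefix, action in POLICY:
        if path.startswith(prefix):
            return action
    return DEFAULT_ACTION
```

Run with the table empty, this is pure measurement: every bot request is counted and nothing is blocked, which is the "how big of a problem do I really have" step.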

John: Well, and I would also say, kind of flipping that whole question around a little bit, Paul, if I was somebody that was going to be sinking $100 million into a company, then this would be something that I would want to make sure that the data that I'm getting from that company is accurate. And if you had something like this in front of it, you can actually say, "Hey, look, our numbers, as far as people that are coming to our website, are actual real human beings." And I think that that provides a lot of value too.

Rami: Yeah, I mean if I was a shareholder in that social network that you're talking about, I don't understand how they're not clamoring for this. How they're not saying, "How many real users do you actually have?" Nobody knows, right? If you're an investor, I would absolutely wanna know the real data behind things.

Paul: And now, speaking of investing, Rami, I see you've taken a Series C about a year ago for $21 million. What's your next step? Are you looking for another round of funding? Are you looking to go public, or you know, what's your strategy from here on out?

Rami: So we are on a path to hit cash flow positive, so we don't necessarily need to take another round. We may choose to do it.

Paul: Good for you.

Rami: Yeah, it's tough to do. It's also a pain in the...I liken venture capital to being a drug addict. You go from, you know, one high to another and then you're chasing the dragon in between, just trying to find your next fix. So getting off this path, I think is healthy for a lot of companies. And we'd like to go public, but we keep optionality open just to figure out where we wanna go next. What we really care about is actually solving the problem. I'm a, you know, engineering background. My co-founders are engineers. We more nerd out on the product side and figure if we make good products that people want to buy, the rest will work itself out.

Paul: No, that's awesome. Yeah, it's refreshing to hear. So product-wise, do you have any like big product announcements that you wanna share with our audience, new features?

Rami: Yeah, so we started with protecting websites against automation, against bot attacks, right. And we noticed, and our customers noticed, that the bots shifted to attacking their web APIs. And so last year we introduced a service to protect their APIs, and then we saw the attack vector shift to their native mobile apps. So people could reverse engineer the API calls that interacted with their native mobile apps and then they realized they could get a backdoor to the exact same systems, the same databases, the same content, the same login, etc. So we're about to release, this quarter, SDKs that interact with your native mobile app and plug into our appliance. So you install our SDK on your app and our appliance will sync up with that SDK, make sure that the traffic is legitimate before we let that API traffic go through. So protecting native mobile apps against automation bots is the next step for us.

Paul: Oh, that's fantastic. I like it. John, any closing questions?

John: Nope, I think we're doing good.

Paul: Awesome. Rami, thank you so much for coming on the show and sharing your great knowledge of the space with our audience.

Rami: Hey, thanks so much for having me.

Paul: With that, we'll take a short break and come back and talk about the security news for this week. So stay tuned, don't go anywhere.
