About searchcode

search code

Type in anything you want to find and you will be presented with the matching results, with the relevant lines highlighted. Searches can be narrowed using the filter panel. Some suggested search terms:

Function/Method names E.G. Format, re.compile lang:python
Constant and variable names E.G. ERROR, username
Operations E.G. foreach lang:c#, while(len--
Security Flaws E.G. eval $_GET
Usage E.G. import flash.display.Sprite;
Special Characters E.G. @microsoft.com
Within Terms E.G. @micros
Finding Emails E.G. mailto: gmail.com
Library usage E.G. google guice createinjector lang:java

Note that longer search terms produce better results, and all terms should be at least 3 characters long.

Searching

Type any term you want to search for in the search box and press the enter key. Generally, the best results come from searching for terms that you expect to be close to each other on the same line.

Other characters are treated as part of the search itself. This means that a search for something such as i++; is not only a legal search, it is also likely to return results for most code bases.

If a search does not return the results you are expecting, or returns no results at all, consider rewriting the query. For example, searching for Arrays.asList("a1", "a2", "b1", "c2", "c1") could be turned into a looser query by searching for Arrays.asList or Arrays asList. Another example would be EMAIL_ADDRESS_REGEX for email address regex.
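The loosening described above can be sketched in a few lines of Python. This is only an illustration of the idea, not something searchcode does for you; the function name loosen_query is made up for this example.

```python
import re


def loosen_query(query: str) -> list[str]:
    """Return looser variants of a code search query (hypothetical helper)."""
    # Drop everything from the first opening parenthesis onwards,
    # so the call arguments no longer have to match.
    head = query.split("(", 1)[0].strip()

    variants = []
    if head and head != query:
        variants.append(head)

    # Split on dots and whitespace so each part becomes its own search term.
    spaced = " ".join(part for part in re.split(r"[.\s]+", head) if part)
    if spaced and spaced != query and spaced not in variants:
        variants.append(spaced)

    return variants


print(loosen_query('Arrays.asList("a1", "a2", "b1", "c2", "c1")'))
# prints ['Arrays.asList', 'Arrays asList']
```

Try the variants in order, from the most specific to the loosest, and stop at the first one that returns useful results.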

To view the full file that is returned click on the name of the file.

Filters

Any search can be filtered down to a specific source or identified language using the refinement options. Select one or more of each and click the "Apply" button to do this.

Filters on the normal interface persist between searches. This allows you to select a specific repository or language and continue searching. To clear applied filters, uncheck the filters individually and click on "Filter Selected". You can also click the "Clear Filters" button to clear all active filters. On the HTML-only page, filters are cleared with every new search.

Estimated Cost

The estimated cost for any file or project is calculated using the Basic COCOMO algorithmic software cost estimation model. The cost reflected includes design, coding, testing, documentation for both developers and users, equipment, buildings, etc., which can result in a higher estimate than would be expected. Generally consider this the cost of developing the code, and not what it is "worth". It is based on an average salary of $56,000 per year.
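As a rough illustration of how such an estimate can be derived, the sketch below applies the Basic COCOMO effort formula (effort in person-months = a * KLOC ^ b) and multiplies the result by a monthly salary. It assumes the "organic" project class and an overhead multiplier of 1.0; neither is confirmed as what searchcode actually uses, and the function and parameter names are made up for this example.

```python
# Basic COCOMO effort coefficients (a, b) for each project class.
COEFFICIENTS = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def estimate_cost(lines_of_code, average_salary=56_000,
                  project_class="organic", overhead=1.0):
    """Estimate development cost in dollars (hypothetical helper).

    overhead is an assumed multiplier for non-salary costs such as
    equipment and buildings; 1.0 means salary only.
    """
    a, b = COEFFICIENTS[project_class]
    kloc = lines_of_code / 1000             # thousands of lines of code
    effort_person_months = a * kloc ** b    # Basic COCOMO effort equation
    return effort_person_months * (average_salary / 12) * overhead


print(round(estimate_cost(1000)))
# prints 11200
```

Because the exponent b is greater than 1, the estimate grows slightly faster than the line count, so a project twice the size costs a little more than twice as much.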

Privacy

searchcode does not capture or store personally identifiable information, and as a general rule is only interested in data in aggregate. searchcode will never share internal logs with any third party unless required legally (avoiding jail is a serious motivator).

What is collected? When accessing any page or running any query the following may be logged:

Search Term (if applicable)
Time the page or query was run
General details, such as the page of the search results, how many results were found, and whether the searcher is a robot

However, personally identifiable information (IP address, etc.) may be used at run time to determine whether a hit to a page or search comes from a robot or a genuine user. searchcode reserves the right to downgrade performance for robots to provide a better experience for users.

Feedback

Should you wish to provide feedback on this privacy policy, please contact Ben Boyter via email at ben@boyter.org or via Twitter @boyter.

The Story

Around 2010 I was looking for a project to work on. I was reasonably unsatisfied in my role at the time and decided to work on something to keep my mind occupied. Following a startup hackathon, where I realised that actually shipping something is both easier and harder than you would think, I started work on what became known as searchforphp.com. I had always wanted to run my own search engine but did not have the skills or capital required to build a web one. I released searchforphp with a post on Hacker News and it received some pretty reasonable attention.

Following the release on Hacker News I started talking to Gabriel Weinberg of DuckDuckGo fame. I wanted to emulate the !bang syntax in searchforphp and thought I would ask first. This started a relationship where I began supplying documentation sources for the DuckDuckGo zero-click information. I suspect I may have been the first contributor to DuckDuckGo in earnest. Gabriel asked if I was planning on expanding out to other documentation, and I started down the path of creating searchforpython.com, searchforjava.com, etc. After a short time I realised that this was probably not the best of ideas and started looking for a domain to encompass them all. Gabriel suggested searchco.de and thus searchcode was born.

A few months after releasing the first few versions of searchcode, Google announced that they were going to sunset their code search. In a passing comment Gabriel asked if I was planning to implement code search. I had not considered it up till that point but decided that it shouldn't be too difficult and started work. Boy was I ever wrong about it not being too difficult. The first release indexed about 1 GB of source code and ran on a VPS hosted by Atum.com.

After some time it became apparent that the VPS did not have enough disk space or power to support the growing index. It was moved over to two dedicated servers running at Hetzner. A little bit later it was rewritten from PHP into a Python Django version and moved to the current domain of searchcode.com. Over time the searchable code base grew to over 20 billion lines of code.

The next iteration expanded to four dedicated servers and was rewritten in Go, with the Sphinx index being replaced with Manticore.

The most recent incarnation is now running on a single dedicated server. It has also been rebuilt using the Go programming language, running a custom bloom filter backed index. With the closure of Ohloh Code Search it is now probably the most widely spanning code search engine, one of the largest next to GitHub, the only one with a freely available API, and certainly the largest web available code index run by a single person. What's next? Well, the move allows a much more scalable infrastructure and the eventual goal is to index even more of the world's source code.

Team / Contact

searchcode is currently the work of a single developer standing on the shoulders of giants.

Feel free to contact me at ben@boyter.org or via Twitter @boyter, or follow developments at https://boyter.org/