Host outside of IIS #22

Closed
vors opened this issue Dec 8, 2015 · 9 comments

Comments

@vors

vors commented Dec 8, 2015

I tried hosting the generated website with a simple Python web server:

python -m http.server 8000

Search doesn't work in this case.

@KirillOsenkov
Owner

This is by design. The search part was written with ASP.NET Web API and would require a complete rewrite to support other technologies.

@atifaziz
Contributor

atifaziz commented Dec 8, 2015

@KirillOsenkov What is it that requires server-side processing? Could one not just generate a static website (that could then be hosted on any web server/site, including GitHub Pages) & use JavaScript to implement client-side search? Is there anything architecturally preventing that?

@KirillOsenkov
Owner

The only thing that's on the server is a list of declarations:
http://sourcebrowser.azurewebsites.net/#Microsoft.SourceBrowser.SourceIndexServer/Models/IndexLoader.cs,901622aa75abf138

and a few auxiliary data structures (list of all assemblies and all projects).

But primarily it's a list of all declared symbols:
http://sourcebrowser.azurewebsites.net/#Microsoft.SourceBrowser.SourceIndexServer/Models/IndexLoader.cs,e7decef3080779f5

Each declared symbol is basically a type with 5 things:
http://sourcebrowser.azurewebsites.net/#Microsoft.SourceBrowser.SourceIndexServer/Models/IndexEntry.cs,671f1c8b3a24aa51

Assembly number, glyph (icon), name (what you search for), symbol ID (the hex number used in hyperlinks to it) and the description (usually the full namespace and type and member name).
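
As a rough illustration of the five fields listed above, here is a minimal Python sketch of one entry. The real type is C# (`IndexEntry.cs`); the field names, types, and sample values below are assumptions made for illustration and are not copied from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical stand-in for one declared symbol in the index."""
    assembly_number: int  # index into the list of all assemblies
    glyph: int            # icon identifier shown next to the result
    name: str             # the text a user searches for
    symbol_id: str        # hex ID used in hyperlinks to the symbol
    description: str      # usually namespace + type + member name


# Made-up sample values, for illustration only.
entry = IndexEntry(
    assembly_number=3,
    glyph=12,
    name="SampleType",
    symbol_id="0123456789abcdef",
    description="Sample.Namespace.SampleType",
)
```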

You can store this static list on the server as .txt, download it in JavaScript and implement search on the client without even going to the server.
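
The idea above could look roughly like the sketch below. This is not how SourceBrowser works today. It assumes a hypothetical tab-separated text format (one symbol per line, the five fields in the order listed above) and a simple case-insensitive prefix match; a browser version would do the same thing in JavaScript after downloading the file.

```python
import bisect


def parse_index(text):
    """Parse the hypothetical tab-separated index, one symbol per line."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        assembly, glyph, name, symbol_id, description = line.split("\t", 4)
        # The lower-cased name goes first so that sorting groups prefixes.
        entries.append(
            (name.lower(), name, int(assembly), int(glyph), symbol_id, description)
        )
    entries.sort()
    return entries


def search(entries, query, limit=10):
    """Return up to `limit` entries whose name starts with `query`."""
    prefix = query.lower()
    # Binary search for the first entry whose key is >= the prefix.
    start = bisect.bisect_left(entries, (prefix,))
    results = []
    for entry in entries[start:start + limit]:
        if not entry[0].startswith(prefix):
            break
        results.append(entry)
    return results


# Made-up sample data, for illustration only.
SAMPLE = "\n".join([
    "0\t1\tIndexLoader\t00000000000000a1\tSample.Models.IndexLoader",
    "0\t1\tIndexEntry\t00000000000000a2\tSample.Models.IndexEntry",
    "1\t2\tHuffman\t00000000000000a3\tSample.Common.Huffman",
])
index = parse_index(SAMPLE)
matches = [entry[1] for entry in search(index, "index")]
```

Keeping the list sorted means each query is a binary search plus a short scan, so query time is unlikely to be the bottleneck; the download size and memory footprint of the list are.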

The problem with that approach is that SourceBrowser was designed to be highly scalable. It easily works with 60 million lines of code (all of Microsoft Developer Division source) and can scale to 100 million. That currently means around 6 million symbols (4 GB of memory, compressed). This is not something you can do on the client. Holding this list on the server is easy.

Implementing the feature we're talking about would basically mean removing SourceIndexServer and replacing it with a client-side solution. This is a non-trivial amount of work and not something I'm willing to do (no time). I also don't believe it will scale past 100-200 thousand symbols, which would cover most midsize codebases but not the bigger ones.
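
To get a feel for why symbol count matters on the client, here is a back-of-the-envelope estimate of how large a downloadable index might be. The 100 bytes per entry is purely an assumption for illustration, not a measured value from SourceBrowser; plug in a real average to get a meaningful number.

```python
def estimated_index_bytes(symbol_count, avg_bytes_per_entry):
    """Rough uncompressed size of a flat text index."""
    return symbol_count * avg_bytes_per_entry


# Assumption for illustration only; not measured on a real index.
ASSUMED_BYTES_PER_ENTRY = 100

# 200 thousand is the upper bound mentioned above; 6 million is the
# symbol count quoted for the large codebase.
for count in (200_000, 6_000_000):
    megabytes = estimated_index_bytes(count, ASSUMED_BYTES_PER_ENTRY) / 1_000_000
    print(f"{count:>9,} symbols -> about {megabytes:,.0f} MB uncompressed")
```

Under that assumption the index grows from tens of megabytes to hundreds of megabytes between the two symbol counts, all of which a browser would have to download and hold in memory before the first search.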

If someone is willing to do the work, feel free to do so in your own forks, but I won't be taking this as a PR into my original repo (lots of support work I'm not prepared to do).

Thanks!

@vors
Author

vors commented Dec 9, 2015

Thank you for the explanation!
I think the description is a little bit misleading:

"Create and host your own static HTML website"

"Static HTML" gives the impression that there is no need to run it in IIS.
Maybe it's worth elaborating on the server-side search in the project description.

@KirillOsenkov
Owner

You're right. I'll update the description. Thanks!

@atifaziz
Contributor

Thanks for a summary of the design & the background on why the server part may be necessary. I had a hunch it might be about the scalability of the index/source/search when I browsed through the source on its initial publication. I'm guessing that the unmanaged memory allocations are also to work around GC pressure and limits, especially if you're going to load 4GB of compressed memory?

"If someone is willing to do the work, feel free to do so in your own forks, but I won't be taking this as a PR into my original repo (lots of support work I'm not prepared to do)."

Understood. However, it may help to depict the architecture & provide background on some of the design decisions (just as you did above) in a wiki so someone stands a good chance of doing that (or even assessing the feasibility of doing it) without first getting their head wrapped around the entire server code base.

@KirillOsenkov
Owner

Yes, exactly. Using ushort (for tighter struct packing), allocating in native memory (to avoid the 24 bytes of per-object overhead in 64-bit processes), and compressing descriptions are all tricks that helped reduce the memory consumption on the server by a few gigabytes.
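
The same ideas can be sketched in Python, purely as an analogy to the C# server code. The record layout below (two 16-bit fields and one 64-bit ID) is a hypothetical example, and `zlib` is only a stand-in, since the thread does not say which compression scheme is used for descriptions.

```python
import struct
import zlib

# Hypothetical fixed-size record: two unsigned 16-bit fields (the
# "ushort" trick) and one unsigned 64-bit symbol ID, with no padding.
RECORD = struct.Struct("<HHQ")


def pack_entries(entries):
    """Pack (assembly_number, glyph, symbol_id) tuples into one buffer.

    A single contiguous buffer avoids paying per-object overhead for
    every entry, similar in spirit to allocating in native memory.
    """
    buffer = bytearray(RECORD.size * len(entries))
    for i, (assembly, glyph, symbol_id) in enumerate(entries):
        RECORD.pack_into(buffer, i * RECORD.size, assembly, glyph, symbol_id)
    return buffer


def read_entry(buffer, i):
    """Read entry number `i` back out of the packed buffer."""
    return RECORD.unpack_from(buffer, i * RECORD.size)


# Made-up sample data, for illustration only.
entries = [(n % 500, n % 40, 0x1000 + n) for n in range(1000)]
packed = pack_entries(entries)

# Descriptions share long namespace prefixes, so they compress well.
raw_descriptions = "\n".join(
    f"Sample.Namespace.Type{n}.Member{n}" for n in range(1000)
).encode("utf-8")
compressed_descriptions = zlib.compress(raw_descriptions)
```

Each packed record takes `RECORD.size` bytes (12 in this layout), so the fixed-width part of a thousand entries fits in 12,000 bytes, and individual entries can still be read by index without unpacking the rest.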

Good idea about the Wiki. I will add it.

@atifaziz
Contributor

@KirillOsenkov That's a start 👍 and thanks.
