Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shape "Tech-debt payment #1" project #1550

Closed
binarylogic opened this issue Jan 20, 2020 · 8 comments
Closed

Shape "Tech-debt payment #1" project #1550

binarylogic opened this issue Jan 20, 2020 · 8 comments
Assignees
Labels
type: task Generic non-code related tasks

Comments

@binarylogic
Copy link
Contributor

I would like to discuss and obtain consensus on what 2 weeks of tech debt work would look like.

@binarylogic binarylogic added the type: task Generic non-code related tasks label Jan 20, 2020
@binarylogic binarylogic added this to the Tech-debt payment #1 milestone Jan 20, 2020
@LucioFranco
Copy link
Contributor

Two big things stand out to me that we should work on as part of this issue:

Clean up unused code

Right now because we make literally everything public, meaning that all sink structs/types are exposed publicly via a public module nested in a public module. We never get warnings about dead/unused code because the compiler can't check future uses. I suggest that we make almost everything private except certain required types to allow the compiler to infer what parts are used/not used.

Add more doc comments

As the code base gets larger and we have more contributors it will become increasingly hard to keep track of specific implementation details. I want us to get into the habit of adding more doc comments explaining high level things related to the source. This could mean for batching to explain concepts around batching. This should enable any engineer to read the comment and ramp up on the actual code without having to jump from place to place.

@binarylogic binarylogic changed the title Outline and plan tech-debt payment #1 Sahpe "Tech-debt payment #1" project Jan 27, 2020
@binarylogic binarylogic changed the title Sahpe "Tech-debt payment #1" project Shape "Tech-debt payment #1" project Jan 27, 2020
@MOZGIII
Copy link
Contributor

MOZGIII commented Feb 6, 2020

I'd be happy if we try to address #1514. Editor responsiveness with rust-analyzer is not as bad as with rls, but it's still a bit painful.

@LucioFranco
Copy link
Contributor

The new rust-analyzers in the past week have been very good for me 👍 If you haven't tried it out I do recommend it!

@lukesteensen
Copy link
Member

Proposal

I've gone through all the tech debt issues we've amassed and attempted to triage them. There are a few larger groupings that we should be able to knock out with well-planned changes, and a lot of one-off smaller things that can be tackled independently.

Below, I've broken down the larger items I think we should tackle.

Simplifying Shutdown

I believe this may be a partial blocker to the tokio/futures piece below, but it's worth discussing on its own.

Our current shutdown strategy is very difficult to understand. It's dealt with both at the topology level and within each source, and the implementations have no direct link, making them both more confusing than they should be.

While there are proposals floating around about various forms of structured concurrency that would be very useful for this kind of problem, none are far enough along that we should take the risk of adopting them. Instead, we should find the smallest, simplest change we can make that gets the code to a reasonable level of understandability.

This will require some exploration and a proposal process, but I'm reasonably optimistic that something as simple as passing a shutdown signal into each source at startup would make things much easier to follow.

Plan of attack

  1. Explore options for simplifying shutdown
  2. Write up proposal with spike-level code
  3. Discuss and come to a consensus on how to proceed
  4. Implement the agreed-upon changes

Related issues: #1091, #1195, #1702

Tokio 0.2 / Futures 0.3

This is the big one. We need to keep up with the ecosystem here to avoid becoming blocked on upgrading important dependencies.

Our goal should be to set ourselves up for easy, incremental change. Currently, we have some futures 0.3 code that we map back to futures 0.1 via the compat layer. Success here means flipping that around. We'll still have plenty of futures 0.1 code, but it should be compat-mapped into futures 0.3, which is then the default for all new code.

Actually moving totally to futures 0.3 will be a long, incremental journey. This piece of work should lay the foundation and plot a rough course.

Plan of attack

  1. Investigate and clear potential blockers (Investigate file descriptor use with tokio 0.2 #1695, Verify that tokio 0.2 fixes concurrency bottleneck #1696)
  2. Make the minimal change onto tokio-compat as described in Upgrade to tokio v0.2 and std::future::Future #1142
  3. Test heavily
  4. Triage existing futures 0.1 code

Other related issues: #1639, #716, #990

Simplifying Sinks

We've built up a lot of layers for contributors to use when building sinks. This has gotten to the point where the number of moving parts is much higher than it should be, making it difficult to work on and understand sink code.

The goal here should be to collapse and simplify the conceptual pieces involved in sink building, particularly around batching. There should be a few simple, reusable pieces for contributors to reach for, with clear names and obvious APIs.

Plan of attack

  1. Survey sinks for patterns of both behavior and existing component use
  2. Based on those observed patterns, write up a proposal for improved design of the shared components and a plan for migrating existing sinks, including spike-level code
  3. Discuss and come to a consensus on how to proceed
  4. Implement the agreed-upon changes

Related issues: #1666, #1713, #1659, #1408, #1676

Code Organization

As @LucioFranco noted above, our pervasive use of pub modules means that we miss out on warnings for things like unused code. This largely falls out of how we organize our code between main.rs and everything else.

We currently implement an enormous amount of logic directly in main. This was a simple way to start but has become a bit overwhelming. By carefully breaking down and factoring out that logic, we can have a much smaller, more understandable main that requires far less to be made public. We can then scale back what's exposed and regain those useful warnings.

Plan of attack

  1. Survey main.rs for coherent chunks of logic that could effectively be factored out, either as functions or some Application-like struct with methods
  2. Write up a high-level proposal for restructuring (i.e. not method-by-method, but a rough sketch to help others see direction)
  3. Discuss and come to a consensus on how to proceed
  4. Implement the agreed-upon changes

Related issues: #1383


There's an unending list of other tech debt items I'd like to tackle, but these four are (in my opinion) the most strategically important and well defined. They focus on keeping our component code easy to understand, modern, and clean.

I chose not to include things like improving compile times because I don't believe we have a good enough idea of what exactly needs to be done there and it's unclear how big an impact we could actually have.

There are others like adding doc comments that I think are important but make less sense as part of a one-off, concerted effort. These are the types of things we should start to do as part of our normal work, striving to always leave code more understandable than when we found it.

Let me know what you think!

@binarylogic
Copy link
Contributor Author

Nice write up! Those look like good projects. Based on what you said:

I believe this may be a partial blocker to the tokio/futures piece below, but it's worth discussing on its own.

It sounds like "Tokio 0.2 / Futures 0.3" is the only project blocked, and we could start on the other 3 in parallel (if we wanted).

@binarylogic
Copy link
Contributor Author

binarylogic commented Feb 10, 2020

As part of the "Simplifying Sinks" work we should look at #1376. Nothing, we don't have to solve this as part of the project, it's just an interesting PR to look at.

@lukesteensen
Copy link
Member

#299 is also relevant to the "Simplifying Sinks" work.

@LucioFranco
Copy link
Contributor

I'd like to add another note here, this issue #1402 seems like a good candidate that we should clean up. I'd like to also propose that we move some of our http stuff outside of sinks::util into crate::http since we now have two transforms using that code.

@binarylogic binarylogic removed this from the Tech-debt payment #1 milestone Feb 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: task Generic non-code related tasks
Projects
None yet
Development

No branches or pull requests

4 participants