From: "maciej.mensfeld (Maciej Mensfeld) via ruby-core" <ruby-core@...>
Date: 2023-10-15T14:16:18+00:00
Subject: [ruby-core:115056] [Ruby master Feature#19744] Namespace on read

Issue #19744 has been updated by maciej.mensfeld (Maciej Mensfeld).


If I may.

### `::` scope

Similar to Jeremy, my gems heavily use the top-level reference. Like others, I also have namespaces that collide with the root ones. For example, `Karafka::ActiveJob` operates by referencing itself and its internals locally but also refers to `::ActiveJob`. In some scenarios, I use it to improve readability for developers, as it is easier to track things starting from the root level. Sometimes, it is needed because of name conflicts.
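To make the collision concrete, here is a minimal sketch using stand-in modules (they are not the real ActiveJob or Karafka code, and `Dispatcher` is a made-up name). It only shows how relative and `::`-prefixed lookups diverge when a nested module shares its name with a root one:

```ruby
# Stand-in modules for illustration; NOT the real ActiveJob or Karafka libraries.
module ActiveJob
  def self.origin
    'top-level ActiveJob'
  end
end

module Karafka
  module ActiveJob
    def self.origin
      'Karafka::ActiveJob'
    end
  end

  # Hypothetical module that needs both constants.
  module Dispatcher
    # Relative lookup walks the lexical scope and finds Karafka::ActiveJob first.
    def self.relative
      ActiveJob.origin
    end

    # The leading :: skips the lexical scope and starts at the root.
    def self.absolute
      ::ActiveJob.origin
    end
  end
end

puts Karafka::Dispatcher.relative
puts Karafka::Dispatcher.absolute
```

Without the `::` prefix there is no way to reach the root `ActiveJob` from inside `Karafka`, which is why the meaning of `::` inside a namespace matters for gems like this.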

### User Experience

I am unsure if namespaces will be easy to debug and work with. At the moment, when I debug gems, it is fairly easy for me to see which version is in use and to modify the code in place while researching bugs, monkey patches, etc. With the addition of namespaces, I can imagine this becoming cumbersome, because I would first need to understand the scope in which a given piece of code operates.

### Security

#### Code Execution

I cannot think of any explicit security risks, but, as others mentioned, it may loosen the pressure to upgrade. I can also imagine that it would be confusing to get multiple records for the same vulnerability per project from tools like bundler audit, since the same vulnerability may be reported against several versions in use.

Another question is the complexity of things like reachability analysis. It may become more complex (though not impossible; look at npm).

#### Dependencies

Please read the Bundler section below.

### Bundler

- While the plugin API is not widely used, some people (including me) do use it. I'm almost certain that the introduction of namespaces will cause changes to the plugin API.

> - Bundler should resolve dependencies in the same way as the current one (Most libraries have just 1 version/copy under the vendor)
> - Bundler should recommend users resolve conflicts by updating libraries as far as possible
> - Users can configure to depend on multiple versions of a library only when conflicts cannot be resolved through uncomfortable/un-user-friendly options/commands/etc


I tried to find information on whether our PubGrub implementation would support such behavior but I couldn't. I only found references from Elm and Dart implementations stating that:

> Versions use the semantic versioning scheme (Major.Minor.Patch).
> Packages cannot be simultaneously present at two different versions.

Same with Dart (ref: https://dart.dev/tools/pub/versioning):

> Instead, when you depend on a package, your app only uses a single copy of that package. When you have a shared dependency, everything that depends on it has to agree on which version to use. If they don't, you get an error.

This would mean that to support any "on conflict" suggestions or resolution, we would have to replace or enhance this engine. Such changes always pose a significant risk of introducing new dependency confusion bugs. On top of that, we need to answer the question of how such multi-version operations should behave with constraints coming from multiple sources. What if the same "name" comes from two sources, one public and one private? Since we will allow for namespacing, should such a thing be allowed? If so, we may have to update how gems are cached locally to include their full source to avoid name collisions.
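A toy model of the caching concern, with loud caveats: `GemSpec`, the gem name `acme-utils`, and the private source URL are all hypothetical, and this is not how RubyGems or Bundler actually store gems. It only shows why a key without the source loses one of two same-named gems:

```ruby
# Hypothetical model of a local gem cache; not RubyGems/Bundler internals.
GemSpec = Struct.new(:source, :name, :version)

public_gem  = GemSpec.new('https://rubygems.org', 'acme-utils', '1.2.0')
private_gem = GemSpec.new('https://gems.example.internal', 'acme-utils', '1.2.0')

# Keyed by name and version only: the second write silently replaces the first.
by_name = {}
[public_gem, private_gem].each { |g| by_name["#{g.name}-#{g.version}"] = g }

# Keyed by source as well: both gems can coexist.
by_source = {}
[public_gem, private_gem].each { |g| by_source[[g.source, g.name, g.version]] = g }

puts by_name.size
puts by_source.size
```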

Finally, if we go with this:

>  Bundler should resolve dependencies in the same way as the current one

It only mitigates "complete" conflicts but does not prevent situations where A & B depend on C and are both able to resolve to something old but acceptable. The issue of one of the dependencies "limiting" things will remain.
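The "old but acceptable" case can be sketched with the real `Gem::Version` and `Gem::Requirement` classes from RubyGems. The gem names and version numbers are invented, and picking the highest shared version is a simplification, not Bundler's actual resolver:

```ruby
require 'rubygems'

# Hypothetical published versions of a shared dependency C.
available_c = %w[1.0 1.5 2.0 3.0].map { |v| Gem::Version.new(v) }

# Hypothetical constraints: A is stuck on the 1.x line, B accepts anything from 1.0 up.
requirements = {
  'a' => Gem::Requirement.new('< 2.0'),
  'b' => Gem::Requirement.new('>= 1.0')
}

# With one copy of C per bundle, every dependent must agree on a single version.
shared = available_c.select do |version|
  requirements.values.all? { |req| req.satisfied_by?(version) }
end.max

# What B could use if A were not in the picture.
best_for_b = available_c.select { |v| requirements['b'].satisfied_by?(v) }.max

puts "shared C: #{shared}"
puts "best C for B alone: #{best_for_b}"
```

Resolution succeeds, so no "on conflict" path is ever triggered, yet B is held back on an old C by A's constraint.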

> How can Gemfile.lock handle this? Probably, the format needs to change significantly to support that?

Yes, though I think it can be done in a way that would be compatible as long as there are no namespaces in use.

> How long might it take to implement all this in RubyGems & Bundler? Until then, this feature is probably unusable for most Rubyists, as most use RubyGems & Bundler.

Great question to David Rodríguez - I'll ping him.

## Learning from other Registries / Technologies

Before going "all in" on a feature like this, I think it would be good to research how often and at what scale it is actually used elsewhere. Maybe we could get anonymous data on the structure of lock files from technologies that support this and analyze how frequently the feature is adopted. We could use OSS data to understand how often it is used in packages, and data from actual projects to get useful insights.

## Summary

I'm hesitant about it.

On the one hand, I've missed it a few times myself. On the other hand, I am not sure that this is a feature that will get wide adoption. Once it is in and beyond an experimental phase, it will have to get solid support from Ruby Core, RubyGems, and Bundler. For it to be considered usable, it must have a great user experience around usage and debuggability, and solid documentation for users to understand.

> But on the other hand, libraries that are not updated still exist even under very strong pressure. 

Absolutely, and at the same time, some of them get adopted and become maintained again. The pressure will be lowered if such a problem can be "bypassed" by namespaces.

While useful at times, I do not feel this will get wide adoption, especially as @tagomoris himself said here:

> - Bundler should resolve dependencies in the same way as the current one (Most libraries have just 1 version/copy under the vendor)
> - Bundler should recommend users resolve conflicts by updating libraries as far as possible


----------------------------------------
Feature #19744: Namespace on read
https://bugs.ruby-lang.org/issues/19744#change-104931

* Author: tagomoris (Satoshi Tagomori)
* Status: Open
* Priority: Normal
----------------------------------------
# What is the "Namespace on read"

This proposes a new feature to define virtual top-level namespaces in Ruby. Those namespaces can require/load libraries (either .rb or native extension) separately from the global namespace. Dependencies of required/loaded libraries are also required/loaded in the namespace.

### Motivation

The "namespace on read" can solve the two problems below and can pave the way to solving a third. The details of these motivations are described in the "Motivation details" section below.

#### Avoiding name conflicts between libraries

Applications can safely require two different libraries that use the same module name.

#### Avoiding unexpected globally shared modules/objects

Applications can make an independent/unshared module instance.

#### (In the future) Multiple versions of gems can be required

Application developers will have fewer version conflicts between gem dependencies if rubygems/bundler supports the namespace on read.

### Example code with this feature

```ruby
# your_module.rb
module YourModule
end

# my_module.rb
require 'your_module'

module MyModule
end

# example.rb
namespace1 = NameSpace.new
namespace1.require('my_module') #=> true

namespace1::MyModule #=> #<Module:0x00000001027ea650>::MyModule (or #<NameSpace:0x00...>::MyModule ?)
namespace1::YourModule # similar to the above

MyModule # NameError
YourModule # NameError

namespace2 = NameSpace.new      # Any number of namespaces can be defined
namespace2.require('my_module') # Different library "instance" from namespace1

require 'my_module' # require in the global namespace

MyModule.object_id != namespace1::MyModule.object_id #=> true
namespace1::MyModule.object_id != namespace2::MyModule.object_id #=> true
```

The required/loaded libraries will define different "instances" of modules/classes in those namespaces (just like the "wrapper" 2nd argument of `Kernel.load`). This doesn't introduce compatibility problems if all libraries use relative name resolution (without forced top-level reference like `::Name`).
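The comparison to `Kernel.load` can be tried today. The sketch below is not the proposed `NameSpace` API; it passes a module as the second argument of `load` (assumes Ruby 3.1 or later), and `WrappedLib` and `load_wrapped` are made-up names:

```ruby
require 'tempfile'

# Writes a tiny library to a temp file and loads it under +wrapper+.
def load_wrapped(source, wrapper)
  Tempfile.create(['wrapped_lib', '.rb']) do |file|
    file.write(source)
    file.flush
    load(file.path, wrapper) # a module as the wrap argument needs Ruby 3.1+
  end
  wrapper
end

source = "module WrappedLib\n  VERSION = '1.0'\nend\n"

copy1 = load_wrapped(source, Module.new)
copy2 = load_wrapped(source, Module.new)

puts copy1::WrappedLib::VERSION
puts copy1::WrappedLib.equal?(copy2::WrappedLib) # separate module "instances"
puts Object.const_defined?(:WrappedLib)          # nothing leaks to the top level
```

Unlike the proposal, `load` with a wrapper does not carry over to files that the loaded file itself pulls in with `require`, which is the gap that "dependencies are also required/loaded in the namespace" is meant to close.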

# "On read": optional, user-driven feature

"On read" is the key aspect of this feature. That means:

* No changes are required in existing/new libraries (except for limited cases, described below)
* No changes are required in applications if they don't need namespaces
* Users can enable/use namespaces just for limited code in the whole library/application

Users can start using this feature step by step (if they want it) without any big jumps.

## Motivation details

This feature can solve multiple problems I have in writing/executing Ruby code. Those are from the 3 problems I mentioned above: name conflicts, globally shared modules, and library version conflicts between dependencies. I'll describe 4 scenarios about those problems.

### Running multiple applications on a Ruby process

Modern computers have many CPU cores and large memory spaces. We sometimes want to have many separate applications (either micro-service architecture or modular monolith). Currently, running those applications requires separate processes. This incurs additional computation costs (especially when developing those applications).

If we have isolated namespaces and can load applications in those namespaces, we'll be able to run apps on a process, with less overhead.

(I want to run many AWS Lambda applications on a process in isolated namespaces.)

### Running tests in isolated namespaces

Tests that require external libraries need many hacks to:

* require a library multiple times
* require many different 3rd party libraries into isolated spaces (those may conflict with each other)

Software with plugin systems (for example, Fluentd) will benefit from namespaces.

In addition, application tests can avoid unexpected side effects if tests are executed in isolated namespaces.

### Safely isolated library instances

Libraries may have globally shared state. For example, [Oj](https://github.com/ohler55/oj) has a global `Oj.default_options` object to change the library behavior. Those options may be changed by any dependency or application, which changes the behavior of `Oj` globally and unexpectedly.

For such libraries, we'll be able to instantiate a safe library instance in an isolated namespace.
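A rough approximation of such isolated instances, again using `load` with a wrapper module (Ruby 3.1 or later) rather than the proposed API. `TinyJson` is a hypothetical stand-in for a library with global options, not Oj itself:

```ruby
require 'tempfile'

# Hypothetical stand-in for a library with a process-wide options hash.
LIB_SOURCE = <<~RUBY
  module TinyJson
    @default_options = { mode: :strict }

    def self.default_options
      @default_options
    end
  end
RUBY

# Loads a fresh copy of the library under its own anonymous module.
def load_copy(source)
  wrapper = Module.new
  Tempfile.create(['tiny_json', '.rb']) do |file|
    file.write(source)
    file.flush
    load(file.path, wrapper)
  end
  wrapper
end

copy_a = load_copy(LIB_SOURCE)
copy_b = load_copy(LIB_SOURCE)

# One consumer changes "global" options; the other copy is unaffected.
copy_a::TinyJson.default_options[:mode] = :compat

puts copy_a::TinyJson.default_options[:mode]
puts copy_b::TinyJson.default_options[:mode]
```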

### Avoiding dependency hells

Modern applications use many libraries, and those libraries require many more dependencies. Those dependencies cause version conflicts very often. In such cases, application developers have to resolve them by updating each library, or just wait for new releases of the conflicting libraries. Sometimes, library maintainers don't release updated versions, and application developers can do nothing.

If namespaces can require/load a library multiple times, it also becomes possible to require/load different versions of a library in a process. It requires the support of rubygems, but namespaces should be a good foundation for it.

## Expected problems

### Use of top-level references

In my expectation, `::Name` should refer to the top-level `Name` in the global namespace. I expect that `::ENV` should contain the environment variables. But it may cause compatibility problems if library code uses `::MyLibrary` to refer to itself in its deeply nested code.
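The same compatibility problem already shows up with wrapped `load` today, which gives a feel for what would break. This is an approximation with `load(path, module)` (Ruby 3.1 or later), not the proposed feature; `MyLibrary::Deep` and its methods are made up for the example:

```ruby
require 'tempfile'

# Hypothetical library that refers to itself both relatively and via ::.
SELF_REFERENCING_LIB = <<~RUBY
  module MyLibrary
    module Deep
      def self.relative_root
        MyLibrary
      end

      def self.absolute_root
        ::MyLibrary
      end
    end
  end
RUBY

wrapper = Module.new
Tempfile.create(['my_library', '.rb']) do |file|
  file.write(SELF_REFERENCING_LIB)
  file.flush
  load(file.path, wrapper)
end

deep = wrapper::MyLibrary::Deep

# The relative reference resolves to the copy inside the wrapper.
relative_ok = deep.relative_root.equal?(wrapper::MyLibrary)

# The forced top-level reference looks in the global scope, where nothing was defined.
absolute_error =
  begin
    deep.absolute_root
    nil
  rescue NameError => e
    e
  end

puts relative_ok
puts absolute_error.class
```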

### Additional memory consumption

An extension library (dynamically linked library) may be loaded multiple times (by `dlopen` for temporarily copied dll files) to load isolated library "instances" if different namespaces require the same extension library. That consumes additional memory.

In my opinion, additional memory consumption is a minimum cost to realize loading extension libraries multiple times without compatibility issues.

This occurs only when programmers use namespaces. And it's only about libraries that are used in 2 or more namespaces.

### The change of `dlopen` flag about extension libraries

To load an extension library multiple times without conflicting symbols, all extensions should stop sharing symbols globally. Libraries referring to symbols from other extension libraries will have to change code & dependencies.

(On extension libraries, [Naruse also wrote an entry](https://naruse.hateblo.jp/entry/2023/05/22/193411).)

# Misc

The proof-of-concept branch is here: https://github.com/tagomoris/ruby/pull/1
It's still a work-in-progress branch, especially for extension libraries.




-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/