Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for netstandard2.0 #117

Merged
merged 1 commit into from
Jun 21, 2024

Conversation

maumar
Copy link
Contributor

@maumar maumar commented Jun 13, 2024

First stab at the problem. Still work in progress but looks promising. Pretty much the only break is the explicit interface implementation / static abstract method on the ITransport (needs runtime changes so no way to hack it into ns2) and the new parallelization logic (Parallel.ForEachAsync). Chunk is bit much as well. Apart from that Index and Range are the only relatively complicated code that needed to be ported/copied.

I still haven't run any tests on ns2, but all the tests on 6, 7 and 8 pass without issue.

@neon-sunset
Copy link
Owner

neon-sunset commented Jun 13, 2024

It is unfortunate that SemanticKernel decided to still target NS2.0 as it gives people a wrong idea that picking .NET Framework for new code is something that can be justified outside of writing MS Office Add-Ins or Visual Studio extensions 😢

Initially, I was strictly against the idea of supporting NS2.0 - if someone is writing a standalone application, they must not, under any circumstances, use .NET FW: #46

However, because SK continues to target it, and because, if my understanding is correct, you would like to replace SKs own connector implementation with this library (or make it to be overlayed on top of it), then I suppose there is just no other choice.

In either case, thank you for working on this. There is some feedback:

  • Strange, I don't think you need to define any polyfills - all of them are supposed to be provided by PolySharp already referenced by the project, including Index and Range
  • Ideally, static abstract interface member usage on the TTransport should stay for .NET 7+ targets and both .NET 6 and NS2.0 TFMs can share the implementation that brings in the annotations and Activator construction (with or without DIM that's optional, removing the member is a breaking change however), it should "just work" with PolySharp. The goal is to not introduce trimming regressions and warnings - the project has to be trimming warnings-free for safe AOT compatibility, because with warnings it might work, but this is not guaranteed and may explode later as the trimmer behavior is implementation defined for the code with warnings
  • Feel free to just if-def upsert/fetch parallelization (or you can rewrite it with NS2.0 APIs if you wish so, it's just not as clean), it was a small nice bit where C# shows its strength - add a little code for significant throughput improvement, but I don't know how welcome this kind of magic is for projects of SK level

@maumar
Copy link
Contributor Author

maumar commented Jun 13, 2024

However, because SK continues to target it, and because, if my understanding is correct, you would like to replace SKs own connector implementation with this library (or make it to be overlayed on top of it),

Yes, that's my plan. SK MemoryStore would wrap PineconeClient and delegate all the operations to it. Looks like they standarized on ReadOnlyMemory for the vector representation, so #87 could be valuable at some point, but I believe/hope I can get a buy-in from them with NS2 support alone, given that Pinecone.NET has so much more functionality, decent tests etc. SK APIs can use ReadOnlyMemory and will just extract array or copy into a new one under the hood for now to pass to PineconeClient.

  • Strange, I don't think you need to define any polyfills - all of them are supposed to be provided by PolySharp already referenced by the project, including Index and Range

I didn't include that package when targetting ns2 🤦‍♂️ works like a charm now

  • Ideally, static abstract interface member usage on the TTransport should stay for .NET 7+ targets and both .NET 6 and NS2.0 TFMs can share the implementation that brings in the annotations and Activator construction (with or without DIM that's optional, removing the member is a breaking change however), it should "just work" with PolySharp. The goal is to not introduce trimming regressions and warnings - the project has to be trimming warnings-free for safe AOT compatibility, because with warnings it might work, but this is not guaranteed and may explode later as the trimmer behavior is implementation defined for the code with warnings

Makes sense, I just hacked it really quickly to see if there are any insurmountable problems further down the line. Will do proper implementation and clear out AOT warning in the actual PR.

  • Feel free to just if-def upsert/fetch parallelization (or you can rewrite it with NS2.0 APIs if you wish so, it's just not as clean), it was a small nice bit where C# shows its strength - add a little code for significant throughput improvement, but I don't know how welcome this kind of magic is for projects of SK level

Will do!

@maumar maumar marked this pull request as ready for review June 14, 2024 00:35
src/Index.cs Show resolved Hide resolved
@neon-sunset
Copy link
Owner

neon-sunset commented Jun 14, 2024

It appears that SK is now multi-targeted thanks to microsoft/semantic-kernel#4308

This is important as this means it will be able to take advantage of .NET 8 package.

If SK were to integrate this, there are other areas that might be worth investigating (I'm not saying you need to do that, just noting them for posterity):

  • Make sure the package versions are in sync with SK's centralized package management versions, to avoid a downgrade
  • Investigate how SK consumes async iterators, quickly skimming through PineconeDocument shows that it does not mesh well together with how Pinecone API works and how this library exposes it. There isn't much we can do about it besides Fetch and Upsert but these can be streamed back into IAsyncEnumerable SK expects comparatively cleanly when doing parallelizing batches
  • Look into consuming logger factory by the client and propagating it down as needed
  • Consider replacing float[] with ReadOnlyMemory<float> to avoid reallocations or unsafe - as it would be by far the heaviest allocation (now that some cases use 3072-dimension vectors, that would be 12KiB per single vector!
  • SK keeps hurting itself with single-line return awaits. Runtime-handled tasks can't come soon enough to do away with all that nonsense. This is not actionable but it makes me sad every time.

@maumar
Copy link
Contributor Author

maumar commented Jun 14, 2024

another thing is Logging. It shouldn't be a blocker, since logging can just happen at PineconeMemoryStore level, but it might be worth considering exposing a logger/logger factory in Pinecone.NET

@neon-sunset
Copy link
Owner

Thanks, I have updated previous comment (will move them to issues but a bit a later).

Features that are used in the project and not available on NS2 (and not covered by PolySharp) have been added to the Compatibility folder.
Parallel Upsert and Fetch are currently disabled on NS2, everything else should work.

Fixes neon-sunset#116
@maumar
Copy link
Contributor Author

maumar commented Jun 20, 2024

@neon-sunset I met @westey-m from Semantic Kernel yesterday and they are OK with using Pinecone.NET to manage operations for the memory connector. They are currently in the process of re-designing the memory connector APIs, so breaking changes are not as big a problem as usually. Fortunate timing :)

@neon-sunset
Copy link
Owner

Awesome, thanks! If that is the case, it's probably better to release NS2.0 version as-is, and then, once SK decides on the new memory connector shape, Pinecone.NET could adapt accordingly, releasing version 3.0 with breaking changes if needed.

PR changes LGTM.

@neon-sunset neon-sunset merged commit 1ba8c2c into neon-sunset:main Jun 21, 2024
1 check failed
@maumar
Copy link
Contributor Author

maumar commented Jun 21, 2024

Makes sense. AFAICT the API is still in flux, but the general shape and design goals are somewhat baked in. I will take a stab at the new implementation using Pinecone.NET based on their feature branch, see how it goes and report feature requests here: #119

@maumar maumar deleted the netstandard2_support branch June 21, 2024 06:30
@neon-sunset
Copy link
Owner

https://www.nuget.org/packages/Pinecone.NET/2.1.1 has been published to nuget

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants