Skip to content
This repository has been archived by the owner on Feb 9, 2022. It is now read-only.

Generic string interning package to help memory & perf. #96

Merged
merged 1 commit into from
Dec 22, 2016
Merged

Generic string interning package to help memory & perf. #96

merged 1 commit into from
Dec 22, 2016

Conversation

geeknoid
Copy link
Contributor

@geeknoid geeknoid commented Dec 22, 2016

We can leverage this in a number of places to keep our memory
footprint in check.


This change is Reviewable

@douglas-reid
Copy link
Contributor

On mobile, will look in more detail later. Wanted to mention: golang/go#5160 and ask if you had looked at the solution mentioned in that thread (and how this compares).

"sync"
)

type Pool struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems fine. I guess we could also use sync.Pool (a la the example from the other issue), and not worry about the mutex'ing here. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The solution mentioned has the unfortunate effect of zapping the table when the GC decides it needs more memory, rather than on a specific size limit. This means the table can potentially get very big indeed. When multiple components use a similar technique in a process, it means we don't get to decide what feature needs more memory compared to another.

So I like my approach better...

"sync"
)

type Pool struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably document Pool, as it is exported.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

We really need to get lint checking working in the build...

type Pool struct {
sync.RWMutex
strings map[string]string
maxCount int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maxCount is one way to limit the pool. would we ever consider a size-defined limit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent idea, switched to maxSize instead.

@geeknoid
Copy link
Contributor Author

PTAL


// We're only testing the semantics here, we're not actually
// verifying that interning has taken place. That's too hard to
// do in Go...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can actually write a test that checks for the purpose.
Here is how I have done it in the prototype.

mixologist/cp/whitelist/whitelist_test.go#67

import (
   g "github.com/onsi/gomega"
)
func TestWhiteListUnload(t *testing.T) {
	g.RegisterTestingT(t)
	wl, err := build("")
	g.Expect(err).To(g.BeNil())
	var finalized atomic.Value
	finalized.Store(false)
	// check adapter is eventually unloaded
	// This test ensures that after unload is called
	// the adapter is removed from memory
	runtime.SetFinalizer(wl, func(ff interface{}) {
		finalized.Store(true)
	})
	wl.Unload()
	runtime.GC()
	g.Eventually(func() bool {
		runtime.GC()
		return finalized.Load().(bool)
	}).Should(g.BeTrue(), "Adapter was not fully unloaded")
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played with that stuff and it unfortunately doesn't work for strings.

The point here would be to know whether a string gets collected or not. Unfortunately, calling SetFinalize on a string gives a runtime error because a string is a value types and not a pointer. I'd really need to be passing the underlying uintptr which I don't have access to.

Anyway, the library code is so simple, I'm pretty sure it's behaving as it should.

@douglas-reid
Copy link
Contributor

:LGTM: This LGTM. There may be some clean up on the comments, but otherwise, this seems ok.


Reviewed 2 of 3 files at r1, 1 of 1 files at r2.
Review status: all files reviewed at latest revision, 5 unresolved discussions.


pkg/intern/intern.go, line 23 at r1 (raw file):

Previously, geeknoid (Martin Taillefer) wrote…

Done.

We really need to get lint checking working in the build...

Yeah, it is strange that golint didn't generate any output for this in the build for this PR. Did running it yourself generate any issues?


pkg/intern/intern.go, line 33 at r2 (raw file):

const (
	// just a guess of course...
	averageStringLength = 10

nit: elsewhere, a comment reads maxSize/16 where the code is maxSize/averageStringLength. Should this be 16, or should that comment be updated?


pkg/intern/intern.go, line 50 at r2 (raw file):

// this approach of course is that interning will be less efficient.
func NewPool(maxSize int) *Pool {
	// We use maxSize/16 to preallocate space in the map which assumes an average string length of 8 characters

i think this comment needs updating based on the const "averageStringLength"


Comments from Reviewable

We can leverage this in a number of places to keep our memory
footprint in check.
@geeknoid
Copy link
Contributor Author

Review status: 2 of 3 files reviewed at latest revision, 4 unresolved discussions.


pkg/intern/intern.go, line 23 at r1 (raw file):

Previously, douglas-reid (Douglas Reid) wrote…

Yeah, it is strange that golint didn't generate any output for this in the build for this PR. Did running it yourself generate any issues?

Done.


pkg/intern/intern.go, line 26 at r1 (raw file):

Previously, geeknoid (Martin Taillefer) wrote…

Excellent idea, switched to maxSize instead.

Done.


pkg/intern/intern.go, line 33 at r2 (raw file):

Previously, douglas-reid (Douglas Reid) wrote…

nit: elsewhere, a comment reads maxSize/16 where the code is maxSize/averageStringLength. Should this be 16, or should that comment be updated?

Done.


Comments from Reviewable

@douglas-reid
Copy link
Contributor

Reviewed 1 of 1 files at r3.
Review status: all files reviewed at latest revision, 2 unresolved discussions.


Comments from Reviewable

@geeknoid
Copy link
Contributor Author

Review status: all files reviewed at latest revision, 2 unresolved discussions.


pkg/intern/intern.go, line 50 at r2 (raw file):

Previously, douglas-reid (Douglas Reid) wrote…

i think this comment needs updating based on the const "averageStringLength"

Done.


Comments from Reviewable

@mandarjog
Copy link
Contributor

:lgtm:


Comments from Reviewable

Copy link
Contributor

@mandarjog mandarjog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@douglas-reid douglas-reid merged commit d8ac400 into istio:master Dec 22, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants