Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cellsize returns erroneous results for grids with fine spacing [no reprojeciton] #764

Open
alex-s-gardner opened this issue Sep 25, 2024 · 31 comments

Comments

@alex-s-gardner
Copy link
Contributor

alex-s-gardner commented Sep 25, 2024

This makes sense as gridsize is increasing linearly with latitude

using Rasters
using Rasters.Lookups
import GeoFormatTypes as GFT
using Plots

dX = 0.1
dY = -0.1
lon = X(Projected(166.:dX:168.; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dX), crs=EPSG(4326)))
lat = Y(Projected(-78.0:dY:-80.; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dY ), crs=EPSG(4326)))
ras = Raster(rand(lon, lat))
cell_area = Rasters.cellsize(ras)
plot(cell_area[1, :])
Screenshot 2024-09-25 at 1 19 39 PM

if we decrease the step size in X we start to see some jutter

dX = 0.0006
dY = -0.1
lon = X(Projected(166.0:dX:168.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dX), crs=EPSG(4326)))
lat = Y(Projected(-78.0:dY:-80.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dY), crs=EPSG(4326)))
ras = Raster(rand(lon, lat))
cell_area = Rasters.cellsize(ras)
plot(cell_area[1, :])
Screenshot 2024-09-25 at 1 24 47 PM

and things really fall apart when we decrease step size in X and Y

dX = 0.0006
dY = -0.0003
lon = X(Projected(166.0:dX:167.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dX), crs=EPSG(4326)))
lat = Y(Projected(-78.0:dY:-79.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dY), crs=EPSG(4326)))
ras = Raster(rand(lon, lat))
cell_area = Rasters.cellsize(ras)
plot(cell_area[1, :])
Screenshot 2024-09-25 at 1 23 37 PM
@asinghvi17
Copy link
Contributor

@alex-s-gardner can you confirm this is on v0.11.8 of Rasters.jl?

@alex-s-gardner
Copy link
Contributor Author

Yes, it is v0.11.8

@asinghvi17
Copy link
Contributor

asinghvi17 commented Sep 26, 2024

I wonder if we need to densify and subsample points on the grid? I would expect that to have an effect as we increase dx and dy though, not as we decrease...

GI.LinearRing([
(xb[1], yb[1]),
(xb[2], yb[1]),
(xb[2], yb[2]),
(xb[1], yb[2]),
(xb[1], yb[1])
]))

nevermind, I just saw that the crs there is in long-lat. It might be something with the circular area calculation then?

@alex-s-gardner
Copy link
Contributor Author

Seems like a precision issue to me... is Float64 geometry being downgraded anywhere?

@asinghvi17
Copy link
Contributor

No, but it is being converted to radians and then multiplied by pi

@rafaqz
Copy link
Owner

rafaqz commented Sep 27, 2024

Any ideas @tiemvanderdeure ?

@tiemvanderdeure
Copy link
Contributor

Thanks for pointing this out. I don't know exactly but it seems like something is being rounded somewhere?

Anyway, it's been on my list for a little while to make a specialised implementation for lon-lat grids that calculates the size of each latitudinal band in a grid and then divides it up between the cells. That should be much faster and avoids this type of error.

The current implementation just treats each cell as a linearring, which makes it super generic but slow and (apparently) introduces some precision issues.

@asinghvi17
Copy link
Contributor

asinghvi17 commented Sep 27, 2024

specialised implementation for lon-lat grids that calculates the size of each latitudinal band in a grid and then divides it up between the cells

Could that also apply to any CRS that can map to lat/long via mapcrs?

@rafaqz
Copy link
Owner

rafaqz commented Sep 27, 2024

Yes we should be able to do the conversion for any grid aligned crs

@tiemvanderdeure
Copy link
Contributor

specialised implementation for lon-lat grids that calculates the size of each latitudinal band in a grid and then divides it up between the cells

Could that also apply to any CRS that can map to lat/long via mapcrs?

It depends on this line, which (I think) should return true for any lat/long crs:

if convert(CoordSys, crs(dims)) == CoordSys("Earth Projection 1, 104") # check if need to reproject

But there may be a better way to test this?

@rafaqz
Copy link
Owner

rafaqz commented Sep 28, 2024

For mappedcrs we test if changing the projection of two points with the same X and different Y gets points with the same X. But then the user has specified the conversion explicitly and we are just checking it's not really wrong, so its not the same situation.

@tiemvanderdeure
Copy link
Contributor

Okay, I've never used mappedcrs and don't think I've ever seen it in the wild, so just trying to wrap my head around this.

For mappedcrs we test if changing the projection of two points with the same X and different Y gets points with the same X.

Are there ever cases where that holds for two projections with a different CoordSys?

I know that there are some projections which use lat/lon but are shifted/rotated compared to EPSG:4326, but I think all of those have Earth Projection 1. My thinking was that, since we're assuming the world is spherical anyway, we don't need to do any reprojecting in those cases.

@rafaqz
Copy link
Owner

rafaqz commented Sep 28, 2024

I don't really know CoordSys well. But I imagine yes? Anything grid aligned can be converted to another with independent transformations of X and Y. So e.g. a cylindrical equal area projection can be converted to lat/lon by independently transforming it's xs and ys. In lat/lon X will have regular steps, Y will be irregular. Rasters.reproject should try to do this for you.

But it means you can use the same math you are using for lat/lon after the transformation of the lookups.

@tiemvanderdeure
Copy link
Contributor

I don't really know CoordSys well.

I don't either and I can't really find much about it, so if you know a better check I'd be happy.

So basically we can just check if the mappedcrs is lon-lat and if it reproject the lookups and calculate the cellsize for those. That makes sense to me.

@rafaqz
Copy link
Owner

rafaqz commented Sep 28, 2024

Yeah. There is probably a more correct way... We need Julia crs parsers that give you more useful information more easily.

@tiemvanderdeure
Copy link
Contributor

So e.g. a cylindrical equal area projection can be converted to lat/lon by independently transforming it's xs and ys.

Do you have an example of a raster like this, so I can add a test? I can't find any in the docs.

@rafaqz
Copy link
Owner

rafaqz commented Sep 28, 2024

You can resample to it?

@asinghvi17
Copy link
Contributor

I just came across google/s2geometry#190, it seems pretty insightful - maybe we should use one of the approaches there, or convert early to completely spherical (x, y, z) Cartesian points?

@asinghvi17
Copy link
Contributor

With the PR, this works correctly for lat/long. But if I resample the last example raster to web mercator, I get this:

hello

@tiemvanderdeure
Copy link
Contributor

tiemvanderdeure commented Oct 3, 2024

I just came across google/s2geometry#190, it seems pretty insightful - maybe we should use one of the approaches there, or convert early to completely spherical (x, y, z) Cartesian points?

But even in that thread they mention their fallback is Girard's theorem, which is exactly what we use.

In this blogpost: https://www.johndcook.com/blog/2021/11/29/area-of-spherical-triangle/, Erikson's formula was basically equal to the estimate using just planar geometry for the area between some cities in Texas: "The triangle is so flat that numerical error has a bigger effect than the curvature of the earth." And here we're talking about the error that becomes relevant when gridcells are well below 1 km^2.

We actually already test for an equal-area projection with grid spacing of 100x100m (the test is if all grid cells are within 1% of 0.1 km^2).

My vote is for assuming flat space whenever grid cells get much smaller than that and just using the haversine formula.

With the PR, this works correctly for lat/long. But if I resample the last example raster to web mercator, I get this:

Could you copy the code?

@tiemvanderdeure
Copy link
Contributor

Okay thinking about this some more, I'm not sure if the Haversine formula is numerically stable at this grid spacing either. It also involves a bunch of square roots and sin and cos

@tiemvanderdeure
Copy link
Contributor

tiemvanderdeure commented Oct 3, 2024

Did some more testing and I can report back that

  • yes, this is a numeric stability issue
  • yes, Eriksson's formula performs better in this particular example

I did this (_linearring_area is the current implementation):

xb = (166.0, 166.0006)
yb = (-78.0, -78.0003)

ring = [
    (xb[1], yb[1]), 
    (xb[2], yb[1]), 
    (xb[2], yb[2]), 
    (xb[1], yb[2]),
    (xb[1], yb[1])
]

ring_bf = map(p -> BigFloat.(p), ring)

eriksson(GI.LinearRing(ring))
_linearring_area(GI.LinearRing(ring); radius = 1)

eriksson(GI.LinearRing(ring_bf))
_linearring_area(GI.LinearRing(ring_bf); radius = 1)

which returns

julia> eriksson(GI.LinearRing(ring))
1.1399895306967292e-11

julia> _linearring_area(GI.LinearRing(ring); radius = 1)
2.2859936166241823e-11

julia> eriksson(GI.LinearRing(ring_bf))
1.139989369278327047449715369542151261981068071545916771493908664977035933833957e-11

julia> _linearring_area(GI.LinearRing(ring_bf); radius = 1)
1.140013862214309994513259890860796762002232266534835233057189872420615878310853e-11

So: when we use big floats both methods return the same answer, and this answer corresponds to what Eriksson's formula returns using Float64.

I implemented Eriksson's formula like this. It has all these dot products and cross products so I think it's going to allocate no matter what.

function eriksson(ring)
    points = GI.getpoint(ring)
    cartesian_points = lonlat_to_cartesian.(points)
    area = 0.0
    area += eriksson_triangle(cartesian_points[1], cartesian_points[2], cartesian_points[3])
    area += eriksson_triangle(cartesian_points[3], cartesian_points[4], cartesian_points[1])
    return area
end

function eriksson_triangle(a, b, c)
    t = abs(dot(a, cross(b, c)))
    t /= 1 + dot(b,c) + dot(c, a) + dot(a, b)

    # alternative that could be more precise but allocates more
    # t = abs(dot(a, (cross(b .- a, c .- a))) / dot(b .+ a, c .+ a))

    2*atan(t)
end

lonlat_to_cartesian(args) = lonlat_to_cartesian(args...)
function lonlat_to_cartesian(lon, lat)
    x = cosd(lat) * cosd(lon)
    y = cosd(lat) * sind(lon)
    z = sind(lat)
    return [x, y, z]
end

@tiemvanderdeure
Copy link
Contributor

Just some observations from testing in this way:

  • calculation with Girard's theorem starts misbehaving when the height is lower than 0.001
  • it is more unstable for high/low latitudes

So the options are:

  • always use eriksson
  • implement both and let users decide
  • implement both and automatically switch when the area is less than let's say 1e-6 square degrees.

@asinghvi17
Copy link
Contributor

asinghvi17 commented Oct 3, 2024

I implemented Eriksson's formula like this. It has all these dot products and cross products so I think it's going to allocate no matter what.

I usually use static vectors or tuples to get around that, you could probably try that, but StaticArrays is a pretty heavy dependency, and tuples don't have all the nice LinearAlgebra dispatches you'd need.

Although you could define a SphericalPoint type here that encodes those specific dispatches in a way where size is known at compile-time...that sounds like a lot of work :D

@tiemvanderdeure
Copy link
Contributor

Yes or we can define a _dot and _cross that work on tuples of size 3 :)

@asinghvi17
Copy link
Contributor

Here's a super basic implementation that should get you what you need:

using LinearAlgebra

struct SphericalPoint{T <: Real}
	data::NTuple{T, 3}
end

SphericalPoint(x, y, z) = SphericalPoint((x, y, z))

# define the 4 basic mathematical operators elementwise on the data tuple
Base.:+(p::SphericalPoint, q::SphericalPoint) = SphericalPoint(p.data .+ q.data)
Base.:-(p::SphericalPoint, q::SphericalPoint) = SphericalPoint(p.data .- q.data)
Base.:*(p::SphericalPoint, q::SphericalPoint) = SphericalPoint(p.data .* q.data)
Base.:/(p::SphericalPoint, q::SphericalPoint) = SphericalPoint(p.data ./ q.data)
# Define sum on a SphericalPoint to sum across its data
Base.sum(p::SphericalPoint) = sum(p.data)

# define dot and cross products
LinearAlgebra.dot(p::SphericalPoint, q::SphericalPoint) = sum(p * q)
function LinearAlgebra.cross(a::SphericalPoint, b::SphericalPoint)
	a1, a2, a3 = a
    b1, b2, b3 = b
	SphericalPoint((a2*b3-a3*b2, a3*b1-a1*b3, a1*b2-a2*b1))
end

@tiemvanderdeure
Copy link
Contributor

tiemvanderdeure commented Oct 3, 2024

okay let me benchmark this!

EDIT:
not bad!

julia> @btime eriksson($ring)
  1.045 μs (1 allocation: 176 bytes)
1.1399895306958715e-11

julia> @btime _linearring_area($ring; radius = 1)
  2.050 μs (0 allocations: 0 bytes)
2.2859936166241823e-11

EDIT2:
even better after I got rid of the final allocation

julia> @btime eriksson($ring)
  678.571 ns (0 allocations: 0 bytes)
1.1399895306958715e-11

@asinghvi17
Copy link
Contributor

Hmm, curious why there's still an allocation though

@tiemvanderdeure
Copy link
Contributor

Hmm, curious why there's still an allocation though

because of this line, which isn't so hard to get rid of.

    cartesian_points = lonlat_to_cartesian.(points)

@tiemvanderdeure
Copy link
Contributor

Also just reading back in google/s2geometry#190, and the reason they cite for needing to keep Girard's Theorem around is for points that are antipodal, which I don't think is ever going to happen with a Raster, so I think we can safely go for the option to just always use Eriksson's formula

@tiemvanderdeure
Copy link
Contributor

I now implemented this in the same PR: #768

Let's continue the discussion there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants