Set Firestore documents with partially merged fields - Firestore limitation #273

Closed
ZappaUserMan opened this issue Dec 12, 2020 · 3 comments
Labels
api: firestore Issues related to the googleapis/python-firestore API. type: question Request for information or clarification. Not an issue.

Comments

@ZappaUserMan

I am using the Firestore Python client library.

I have now run into a limitation of Firestore and I am wondering if there is a way around it.

TL;DR / Quickly summarized with outputs: https://i.imgur.com/6WACq8Y.png

Here is a test code to try: https://trinket.io/python/abd1e75cd5

Explanation:
Imagine we have this map / Dict (dictVar1):

dictVar1 = {
    "testArray": ["Yes"],
    "testMap": {
        "test1": 1,
        "test2": 1
    }
}

Originally I stored testMap as an array, but because of a Firestore query limitation (a query can contain only one array-contains clause), I changed the structure to a map, as shown in dictVar1 above. Without that limitation I would have kept the array.

Now I am facing another Firestore limitation due to the new structure.

What I would like to do & other conditions:

  1. I want to add this map / dict to a Firestore document.
  2. I would like to do it in one Firestore operation, using a Firestore batch.
  3. I don't know if the document exists or not before updating/creating
  4. One batch can contain anything between 1 and 500 operations
  5. If the document exists, I do not want to remove any other fields from the existing document if these fields are not present in dictVar1 dict / map.
  6. The fields in dictVar1 dict / map should replace the fields in the document completely

So if the existing document would contain this data:

{
    "doNotChange": "String",
    "testMap": {
        "test0": 1
    }
}

It would be updated to ("test0" is removed from the inner map, basically how an array would work):

{
    "doNotChange": "String",
    "testArray": ["Yes"],
    "testMap": {
        "test1": 1,
        "test2": 1
    }
}

And if the document doesn't exist, the document would be set to:

{
    "testArray": ["Yes"],
    "testMap": {
        "test1": 1,
        "test2": 1
    }
}
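The two outcomes above amount to "replace every top-level field present in dictVar1, leave every other field alone". A minimal pure-Python sketch of that rule follows. It only models the desired result locally and is not a Firestore call; `replace_top_level` is a hypothetical helper name.

```python
from typing import Any, Dict, Optional


def replace_top_level(
    existing: Optional[Dict[str, Any]], new_fields: Dict[str, Any]
) -> Dict[str, Any]:
    """Model the desired write: fields in new_fields replace same-named
    top-level fields wholesale; all other existing fields are kept."""
    result = dict(existing or {})  # a missing document behaves like an empty one
    result.update(new_fields)      # whole-field replacement, no nested merging
    return result


existing_doc = {"doNotChange": "String", "testMap": {"test0": 1}}
dictVar1 = {"testArray": ["Yes"], "testMap": {"test1": 1, "test2": 1}}

updated = replace_top_level(existing_doc, dictVar1)  # document already exists
created = replace_top_level(None, dictVar1)          # document does not exist
```

Here `updated` keeps "doNotChange" and drops "test0", while `created` equals dictVar1.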

I see two ways to do this:

  1. Do this in two operations
  2. Instead of using testMap as a map, replace it with an array.

99% of the time the document exists, so I am fine with two operations when the document doesn't exist, as long as the common case (the document exists) takes one operation.

This could be done with Firestore's update function, but I am using a batch that may update hundreds of documents, and a single missing document would make the whole batch fail.

Another potential solution would be to:

  1. Run the batch with updates. If it succeeds, great. If a 404 (document not found) is raised, then:
  2. Change the operation for that document from update to set and redo the batch, looping until the batch succeeds.

Two potential problems I see with this:

  1. Will I be charged in full for every failed batch operation, or just 1 read per failed batch? If I am charged in full, this is still not a good solution.
  2. Is it possible to easily change the operation type for a specific document reference to a different operation type without having to recreate the batch operation totally from scratch?
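On the second point, one way to avoid depending on whether a queued write can be edited in place is to keep the pending writes as plain data and fill a fresh batch from that list on each attempt. The sketch below assumes only that the batch object exposes `update(reference, data)` and `set(reference, data)`; `PendingWrite`, `fill_batch`, and `switch_to_set` are hypothetical names. How to learn which document was missing from a failed commit is not covered here, and neither is billing.

```python
from typing import Any, Dict, List, Tuple

# (document reference, field data, operation name), where the operation
# name is either "update" or "set". Hypothetical structure for illustration.
PendingWrite = Tuple[Any, Dict[str, Any], str]


def fill_batch(batch: Any, writes: List[PendingWrite]) -> None:
    """Queue every pending write on a freshly created batch."""
    for doc_ref, data, op in writes:
        if op == "update":
            batch.update(doc_ref, data)
        elif op == "set":
            batch.set(doc_ref, data)
        else:
            raise ValueError(f"unknown operation: {op!r}")


def switch_to_set(writes: List[PendingWrite], missing_ref: Any) -> List[PendingWrite]:
    """Return a copy of the plan where the missing document uses set."""
    return [
        (ref, data, "set" if ref == missing_ref else op)
        for ref, data, op in writes
    ]
```

With this shape, a retry creates a new batch, calls `fill_batch` with the adjusted list, and commits again, so the original plan never has to be rebuilt by hand.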

Do you have any ideas on how I could solve one of these problems?

@product-auto-label product-auto-label bot added the api: firestore Issues related to the googleapis/python-firestore API. label Dec 12, 2020
@yoshi-automation yoshi-automation added the triage me I really want to be triaged. label Dec 13, 2020
@tseaver tseaver added type: question Request for information or clarification. Not an issue. and removed triage me I really want to be triaged. labels Dec 15, 2020
@tseaver
Contributor

tseaver commented Dec 15, 2020

@ZappaUserMan The Document.set / Batch.set methods take an optional merge parameter.

  • If merge is not passed, set behaves much like a pure "upsert" (create if not exists, else replace whole document).
  • If merge is passed as True, set creates the document if it doesn't exist, but merges the two documents if it does: existing top-level keys which aren't present in the passed data are untouched.
  • If merge is passed as a list of field specifiers, only those fields are updated in existing documents.

I've opened #277 to track clarifying the client library docs for Document.set.
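As an illustration of the third form, the sketch below passes the top-level keys of the data as the `merge` list when queuing a batch write. It assumes that naming a map field in `merge` overwrites that field as a whole in an existing document, nested keys included, which should be verified against a real project before relying on it. `queue_field_replacement` is a hypothetical helper, and field names containing dots or other special characters are not handled.

```python
from typing import Any, Dict


def queue_field_replacement(batch: Any, doc_ref: Any, data: Dict[str, Any]) -> None:
    """Queue a set() whose merge list names each top-level key in `data`.

    Assumption: fields named in `merge` are written as given, and fields
    not named are left untouched in an existing document.
    """
    batch.set(doc_ref, data, merge=list(data))
```

With a real client this would be called as `queue_field_replacement(db.batch(), db.collection("test").document("testDoc"), dictVar1)` followed by a commit on the batch.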

@tseaver
Contributor

tseaver commented Dec 15, 2020

Please feel free to re-open / follow up if this answer doesn't work for you.

@tseaver tseaver closed this as completed Dec 15, 2020
@ZappaUserMan
Author

ZappaUserMan commented Dec 16, 2020

Hi @tseaver, thank you for your reply. I am not able to re-open the issue as I am not a collaborator of this project.

I already know about set with merge, but that does not help out in this case.

from json import dumps
from google.cloud import firestore

db = firestore.Client.from_service_account_json("firebaseKeysDev.json")

originalDoc = {
    "doNotChange": "String",
    "testMap": {
        "test0": 1
    }
}

dictVar1 = {
    "testArray": ["Yes"],
    "testMap": {
        "test1": 1,
        "test2": 1
    }
}

prefOutput = {
    "doNotChange": "String",
    "testArray": [
        "Yes"
    ],
    "testMap": {
        "test1": 1,
        "test2": 1
    }
}

# Let's first create the document with the original dict / map
originalSetOp = db.collection("test").document("testDoc").set(originalDoc)

# Now let's get the original map / dict from Firestore
originalOpDoc = db.collection("test").document("testDoc").get()
# Convert to Python Dict
originalOpDocDict = originalOpDoc.to_dict()

# Now let's print out the original document dict
print("Here is the original map:")
print(dumps(originalOpDocDict, ensure_ascii=False, sort_keys=True, indent=4))

# Now let's merge the original dict / map with our dictVar1 dict / map
mergeDictVar1WithODoc = db.collection("test").document("testDoc").set(dictVar1, merge=True)

# Now let's get the new merged map / dict from Firestore
newDictDoc = db.collection("test").document("testDoc").get()
# Convert to Python Dict
newDictDocDict = newDictDoc.to_dict()

# Let's print the new merged dict / map
print("\nHere is the merged map:")
print(dumps(newDictDocDict, ensure_ascii=False, sort_keys=True, indent=4))

print("\nHere is the output we want:")
print(dumps(prefOutput, ensure_ascii=False, sort_keys=True, indent=4))

You can try to run that code yourself and see what the problem is.
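For readers without a Firestore project at hand, the mismatch can be modelled locally. The sketch below assumes that `merge=True` merges nested maps key by key, which is what the problem described in this thread implies. `deep_merge` is a hypothetical stand-in for that behaviour, not Firestore's implementation.

```python
from typing import Any, Dict


def deep_merge(existing: Dict[str, Any], incoming: Dict[str, Any]) -> Dict[str, Any]:
    """Merge `incoming` into a copy of `existing`, recursing into nested maps."""
    merged = dict(existing)
    for key, value in incoming.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # nested keys survive
        else:
            merged[key] = value
    return merged


original_doc = {"doNotChange": "String", "testMap": {"test0": 1}}
dict_var1 = {"testArray": ["Yes"], "testMap": {"test1": 1, "test2": 1}}
pref_output = {
    "doNotChange": "String",
    "testArray": ["Yes"],
    "testMap": {"test1": 1, "test2": 1},
}

merged = deep_merge(original_doc, dict_var1)
print(merged["testMap"])      # {'test0': 1, 'test1': 1, 'test2': 1}
print(merged == pref_output)  # False
```

Under this model "test0" stays in testMap, so the merged document differs from the preferred output.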
