[BUG] Lamp not booting up after flashing with latest ESPHome #104

Closed
hellcry37 opened this issue Dec 14, 2022 · 38 comments · Fixed by esphome/esphome#4204

Comments

@hellcry37

Describe the bug
After flashing the latest ESPHome 2022.12, the lamp fails to boot or cannot be connected to. The lamp is no longer accessible in HA, and its IP cannot be reached.

Validation gives me this:
WARNING GPIO12 is a Strapping PIN and should be avoided.
Attaching external pullup/down resistors to strapping pins can cause unexpected failures.
See https://esphome.io/guides/faq.html#why-am-i-getting-a-warning-about-strapping-pins
WARNING GPIO4 is a Strapping PIN and should be avoided.
Attaching external pullup/down resistors to strapping pins can cause unexpected failures.
See https://esphome.io/guides/faq.html#why-am-i-getting-a-warning-about-strapping-pins

To Reproduce
Steps to reproduce the behavior:

  1. updated from 2022.11.5 to 2022.12.0

Expected behavior
The lamp would work normally.

Additional context
none

Please investigate if it's a problem with new esphome.

@randybb

randybb commented Dec 14, 2022

Ouu. Maybe it's the same issue (not related to this package) that I had with my lamps when I updated to 2023.01dev. Could you please connect to it via UART and pull the logs?
If it is crashing during wifi init, then check this thread https://discord.com/channels/429907082951524364/1050868163467812914 - there is a simple solution which doesn't make sense but it works :D

@hellcry37
Author

I am not at home and I'd have to open up the lamp again. I will try tonight.

@hellcry37
Author

I've read the Discord thread but did not see any simple solution. Care to elaborate on this solution?

@randybb

randybb commented Dec 14, 2022

If it is the same problem, then you just need to comment out 4 lines in your YAML, compile & flash, then uncomment them, compile & flash again, and it will work again.

@vta-github

@randybb -> could you please specify the 4 lines?

@randybb

randybb commented Dec 14, 2022

If it is related to the same issue (my log is here), then you need to comment out these 4 lines, install it, then uncomment them and install again.
[image]

@hellcry37
Author

Yeah, I do not have that in my config; it is probably pulled in from this repo. I tried to reflash the device, but it is dead. I'll buy another.

@randybb

randybb commented Dec 14, 2022

@cracrama

cracrama commented Dec 14, 2022

Same here - lamps completely bricked. So to pull off that trick with "uncommenting lines", we need to open the device and connect via serial?

@cracrama

Hello, is there any solution for this?

@mmakaay
Owner

mmakaay commented Dec 16, 2022

I upgraded three lamps myself and they (unfortunately) kept working.

@hellcry37 The device should not be completely dead, no need to buy another. The great thing with the ESP32 is that really bricking it would be quite a feat. One can always flash clean firmware on it to get it going.

The logging that you showed about the strapping pins can be fully ignored. When building your own projects from scratch, these warnings are good, since they make you aware that you might be using pins that could result in unexpected behaviour. For this lamp however, the hardware is as-is and the designers chose to use those pins on purpose. And the lamp works with those pins. So, thank you ESPHome for being friendly by warning us, but we'll ignore those messages.

Side note: One thing that I have in mind for ESPHome, is to implement an option for the pin definition, that can be used to suppress these warnings, so people don't get thrown off by it when compiling the firmware.

For the trick as described by @randybb, you'd indeed need to open the lamp and flash it via serial. He flashed once with a firmware that didn't have those four lines (which basically makes it broken firmware for the type of hardware), and then once again with the four lines enabled again. After that, the lamp started working for him.

As @randybb already stated, it would be interesting to get a dump of the logging that you see on the serial output when booting up the lamp. Mainly to check if it fails in the same spot as for him.

I will try to break my development lamp by downgrading and upgrading it a few times. See if I can hit the issue myself, so I can try to debug this behaviour.

mmakaay changed the title from "[BUG]" to "[BUG] Lamp not booting up after flashing with latest ESPHome" on Dec 16, 2022
@szafran81

szafran81 commented Dec 16, 2022

I also currently have 3 dead lamps at home. Unfortunately, it'll be some time until I'm able to take care of this problem (I have 2 ongoing projects right now, and I want to at least partially finish them before I dig into something else).

ESPHome 2022.12.1 just dropped. Can anyone with a working lamp (or one that was dead because of the 2022.12.0 update and is now fixed) check whether anything changed with this problem?

@mmakaay
Owner

mmakaay commented Dec 17, 2022

One thing from the 2022.12 Changelog that could be fishy is:

Along with some of these bluetooth changes is a change to the underlying flash partition table that ESPHome uses. OTA will work, but to fully take advantage of the performance increases for bluetooth, it is best to at least one serial flash with ESPHome 2022.12.0 or later.

That is the only bit that tickles my spider sense.

Up to now, I have only tried upgrading my lamps to the dev bleeding-edge code. I will try to find some time the upcoming days to do some downgrades to various versions, followed by an upgrade to 2022.12.0 specifically, to see if I can find a reproduction path.

@mmakaay
Owner

mmakaay commented Dec 17, 2022

Hurray! I was able to get my lamp into the "bricked" state as well, by first flashing it using ESPHome 2022.11.0 and then flashing it using ESPHome 2022.12.0. And it indeed was not obvious how to get it back in working order. The good news is: it's not impossible to get it working again 😃

I did many upgrade scenarios, and found that the problem occurs when upgrading from 2022.11.0 to a version of ESPHome after commit #3565: "Update ESP-IDF and platform version". For example ESPHome version 2022.12.0 includes this change.

Recipe for fixing the bricked state

The following steps helped me fix my lamp status.

Step 1: Update the config

Add the following block of YAML code to your device YAML file.
I don't think the order would matter, but I put it after the packages: section.

esp32:
  framework:
    sdkconfig_options:
      CONFIG_FREERTOS_UNICORE: n

Note: When you already have an esp32: section in your configuration, then apply the above setting values to it instead (keeping whatever other settings you have in there).

This change makes the produced firmware fully incompatible with the lamp hardware, because it builds a multi-core firmware for the single-core lamp. It seems however that this is the easiest way to get it ready for the next step. Thanks to @randybb for finding this very peculiar fixing step 👍

Step 2: Compile and flash the firmware via serial

Connect the lamp to the serial port of a computer, bring it into flashing mode by plugging in the power while connecting GPIO0 to GND, and flash the new firmware onto it. After completing the flashing operation, disconnect and reconnect the lamp power.

After this, the lamp will not work, but if you look at the serial logging output, you will see something different than the boot loop from before. It will now likely complain with "Running on single core variant of a chip", but I have also seen another pattern without the single core error. Both were fine for the next step.

Step 3: Bring back the configuration to the old state

Either remove the code that you added, or change your esp32: config section to use the settings:

esp32:
  framework:
    sdkconfig_options:
      CONFIG_FREERTOS_UNICORE: y

Step 4: Compile and flash the firmware via serial

Again flash the firmware to the device using serial and unplug and replug the power afterwards.
This should bring your device back into working order.

mmakaay pushed a commit that referenced this issue Dec 18, 2022
@mmakaay
Owner

mmakaay commented Dec 18, 2022

Hot fix implemented in component version 2021.10.0

To prevent others from running into the same issue, I hot-fixed the core.yaml configuration package in the latest version of my repo. I updated the esp32: section to force the old framework version. That effectively prevents further accidents.
I'm also preparing a new version 2022.12.0, with the same fix applied to it, to communicate that for ESPHome 2022.12.0 a new version of my firmware code is required.
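
As a rough sketch of what "forcing the old framework version" amounts to, the esp32: section below pins the framework and platform versions. The values are the ones that show up (commented out) in a core.yaml excerpt further down in this thread; treat this as an illustration under that assumption, not as the exact contents of the released package.

```yaml
esp32:
  board: esp32doit-devkit-v1
  framework:
    type: esp-idf
    sdkconfig_options:
      # The lamp has a single-core ESP32, so FreeRTOS must run unicore.
      CONFIG_FREERTOS_UNICORE: y
    advanced:
      # Accept the burnt-in MAC address despite its invalid CRC.
      ignore_efuse_mac_crc: true
    # Pin the older ESP-IDF and platform versions, so that ESPHome
    # 2022.12.0 and up does not pull in the newer framework by default.
    version: 4.3.2
    source: ~3.40302.0
    platform_version: platformio/espressif32 @ 3.5.0
```

When these settings come from the package, nothing needs to be added to the device YAML; the sketch only shows which keys do the pinning.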

Some details about the crash that occurred after upgrading

When the firmware is broken, this is the backtrace that the system crashes on:

[23:20:22][V][esp-idf:000]: I (992) phy_init: phy_version 4670,719f9f6,Feb 18 2021,17:07:07
[23:20:22]
[23:20:22][V][esp-idf:000]: W (993) phy_init: failed to load RF calibration data (0x1102), falling ba
[23:20:22]abort() was called at PC 0x400f3e53 on core 0
[23:20:22]
[23:20:22]
[23:20:22]Backtrace:0x400823ee:0x3ffc33a0 0x400887b1:0x3ffc33c0 0x4008e95e:0x3ffc33e0 0x400f3e53:0x3ffc3450 0x40148d38:0x3ffc3490 0x40148dfd:0x3ffc34c0 0x40127eae:0x3ffc34e0 0x40128621:0x3ffc3500 0x40126f08:0x3ffc3520 0x400902d9:0x3ffc3540
WARNING Found stack trace! Trying to decode it
WARNING Decoded 0x400823ee: panic_abort at /Users/mauricem/.platformio/packages/framework-espidf/components/esp_system/panic.c:402
WARNING Decoded 0x400887b1: esp_system_abort at /Users/mauricem/.platformio/packages/framework-espidf/components/esp_system/esp_system.c:128
WARNING Decoded 0x4008e95e: abort at /Users/mauricem/.platformio/packages/framework-espidf/components/newlib/abort.c:46
WARNING Decoded 0x400f3e53: esp_efuse_mac_get_default at /Users/mauricem/.platformio/packages/framework-espidf/components/esp_hw_support/mac_addr.c:136
 (inlined by) esp_efuse_mac_get_default at /Users/mauricem/.platformio/packages/framework-espidf/components/esp_hw_support/mac_addr.c:106
WARNING Decoded 0x40148d38: esp_phy_load_cal_and_init at /Users/mauricem/.platformio/packages/framework-espidf/components/esp_phy/src/phy_init.c:714
WARNING Decoded 0x40148dfd: esp_phy_enable at /Users/mauricem/.platformio/packages/framework-espidf/components/esp_phy/src/phy_init.c:236
WARNING Decoded 0x40127eae: wifi_hw_start
WARNING Decoded 0x40128621: wifi_start_process
WARNING Decoded 0x40126f08: ieee80211_ioctl_process
WARNING Decoded 0x400902d9: ppTask

When following this backtrace, the problem lies in the fact that the new version of the ESP-IDF framework is calling esp_efuse_mac_get_default(). That will fail for this lamp, because the default MAC address as burnt into the device has a wrong checksum burnt alongside it. This invalid checksum causes the panic, that leads to the reboot.

For earlier versions of ESP-IDF, I implemented the feature to ignore invalid MAC address checksums. As a matter of fact, when looking at the failing boot log, you can see that this code is actually being hit:

[23:20:21][C][wifi:037]: Setting up WiFi...
[23:20:21][C][wifi:038]:   Local MAC: 54:48:E6:7D:52:C0
[23:20:21][V][wifi_esp32:120]: Use EFuse MAC without checking CRC: 54:48:E6:7D:52:C0

So here I do read the MAC address myself, ignoring the wrong checksum, and I feed it to the WiFi stack as the MAC address to use for connecting to the network.

In the framework version that was used with 2022.11.0, this step was enough to keep the ESP-IDF framework from looking up the burnt-in MAC address and checksum itself. It used the MAC address that I fed to it beforehand.

Since that apparently no longer prevents the framework from looking up the MAC address, I'll have to dig into the ESP-IDF framework to see if there's a way to prevent the invalid lookup and the subsequent panic.

Oh boy :-)

@mmakaay
Owner

mmakaay commented Dec 18, 2022

For the ESP-IDF framework, it was considered a good idea to deprecate the option CONFIG_ESP32_PHY_CALIBRATION_AND_DATA_STORAGE in favour of the new option CONFIG_ESP_PHY_CALIBRATION_AND_DATA_STORAGE. This option is one of the things that I added to make the "ignore mac CRC" feature work.

So looks like the real fix would be to update ESPHome to configure both the options, so the code will work for both old and new ESP-IDF versions.
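
To make the idea concrete, here is a hand-written sketch of what configuring both option names could look like in a device YAML. The option names come from the comment above; the value n (i.e. disabling calibration data storage) and the idea of setting them manually through sdkconfig_options are assumptions for illustration only. The actual fix belongs in ESPHome itself, behind the ignore_efuse_mac_crc setting.

```yaml
esp32:
  framework:
    type: esp-idf
    sdkconfig_options:
      # Option name used by the older ESP-IDF version (value assumed).
      CONFIG_ESP32_PHY_CALIBRATION_AND_DATA_STORAGE: n
      # Renamed option used by the newer ESP-IDF version (value assumed).
      CONFIG_ESP_PHY_CALIBRATION_AND_DATA_STORAGE: n
```

Setting both names means that whichever framework version ends up being used for the build, it sees the option name that it understands.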

@mmakaay
Owner

mmakaay commented Dec 19, 2022

PR submitted for ESPHome

A PR for ESPHome was submitted for the issue: esphome/esphome#4204
I changed the code to use the correct sdkconfig option, based on the version of the framework that is used for compilation. I tested an upgrade from 2022.11.0 to latest dev with my fix, and that worked correctly.

Steps forward from here

  • For now, I will have to keep the hot fix solution in the repository. I will release a version 2022.12.0 of my firmware, which includes the hot fix (i.e. forcing an older framework version) and some documentation on unbricking a "bricked" device. Quoted "bricked", since the lamp can be saved by means of a little serial flashing dance.

  • Once the PR is included in a stable ESPHome release, we can depend on it, and I will release yet another version of my firmware. In that version, the new ESP-IDF framework can then safely be used.

  • I created a feature request for the ESP-IDF framework, in which I propose a feature that would fix MAC address / CRC issues at the root of the problem. Let's hope the proposal will be picked up.

@hellcry37
Author

hellcry37 commented Dec 20, 2022

@hellcry37 The device should not be completely dead, no need to buy another. The great thing with the ESP32 is that really bricking it would be quite a feat. One can always flash clean firmware on it to get it going.

It is entirely my own fault that I bricked the lamp: I soldered some pins and broke 2 solder points. Now I think I don't have an RX and one ground, and I don't know where else I can find points for those two.

Just to be safe: if I flash the second lamp, which is on 2022.11.5 now, with this config, will it be OK?

# --------------------------------------------------------------------------
# Substitutions
#
# These are substitutions as used by the configuration packages from below.
# You can uncomment and update the ones that you want to modify.
# --------------------------------------------------------------------------

substitutions:
  name: bedside-left-lamp
  friendly_name: 'Bedside Left Lamp'
  light_name: ${friendly_name}
  light_mode_text_sensor_name: ${friendly_name} Light Mode
  default_transition_length: 200ms

# --------------------------------------------------------------------------
# Load configuration packages
#
# These provide a convenient way to compose your device configuration from
# some functional building blocks. Pick and mix the blocks that you need.
#
# For customization you can override options in your config or you can
# copy the contents of these packages directly in your config file as
# an example for your own customizations.
#
# Available packages are:
# - core.yaml                : core components & hardware setup
# - behavior_default.yaml    : default device behavior
# - ota_feedback.yaml        : enable visual feedback during OTA updates
# - activate_preset_svc.yaml : 'activate_preset' service for Home Assistant
# --------------------------------------------------------------------------

packages:
  bslamp2:
    url: https://github.com/mmakaay/esphome-xiaomi_bslamp2
    ref: release/2022.12.0
    files:
      - packages/core.yaml
      - packages/behavior_default.yaml
      - packages/ota_feedback.yaml
      - packages/activate_preset_svc.yaml
    refresh: 0s

# --------------------------------------------------------------------------
# Use your own preferences for these components.
# --------------------------------------------------------------------------

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  domain: .home
  manual_ip:
    static_ip: 192.168.1.33
    gateway: 192.168.1.1
    subnet: 255.255.255.0
    dns1: 192.168.1.2
    dns2: 192.168.1.3
  ap:
    ssid: "${friendly_name}"
    password: !secret default_fallback_ap_pass

api:
  password: !secret bslamp_home_assistant_api_password
  encryption:
    key: !secret bslamp_home_assistant_encryption_key

ota:
  password: !secret bslamp_ota_password

@hellcry37
Author

So if I post which pins I f**ed, can anyone point me to some pictures showing where I could pick them up again so I can serial flash it?

@randybb

randybb commented Dec 20, 2022

If you broke these big pads and you are not able to trace them (or just take esp32 pinouts), I don't think you will be able to solder to even smaller pads.

@hellcry37
Author

hellcry37 commented Dec 20, 2022

I broke the ones in the pictures, yes, the big ones. Maybe there is a chance to find them on other parts of the board?
I'll try what you gave me in the next few days; maybe I'll manage to fix it.

@mmakaay
Owner

mmakaay commented Dec 20, 2022

If more support is needed on this, please don't follow up in this issue. The issue is specifically about the ESPHome 2022.12.0 breakage, which involves a general issue.

@vil1driver

Hi, just to say thanks: my first try today with ESPHome (2022.12.3) and this bslamp2, and all is fine. Big thanks.

@hellcry37
Author

Great. For me it botched the second lamp: after trying to update from 2022.11.5 to 2022.12.3 with the config I mentioned in previous posts, my second lamp is down.

jesserockz pushed a commit to esphome/esphome that referenced this issue Dec 22, 2022
Co-authored-by: Maurice Makaay <maurice@h2b.nl>
fixes mmakaay/esphome-xiaomi_bslamp2#104
@hellcry37
Author

Updating a working lamp from 2022.11.5 directly to 2022.12.3, with the config edited to ref: release/2022.12.0, broke the lamp for me.

I was able to bring it back just by flashing it via serial again with:
esp32:
  framework:
    sdkconfig_options:
      CONFIG_FREERTOS_UNICORE: y

and then flashing it again with CONFIG_FREERTOS_UNICORE removed / disabled.

@Progaros

same problem here: "phy_init: failed to load RF calibration data" after flashing resulting in bootloop

@mmakaay
Owner

mmakaay commented Dec 22, 2022

What does the configuration that you used look like?
The fixed new firmware that I tested, and that got confirmed by @vil1driver too, worked for me.
The important thing is that the esp32: section of the configuration conforms to the example from the 2022.12.0 release of the firmware.

@Progaros The restore procedure can be found in this message

BTW:

The fix that I did for the current release, is forcing ESPHome to use an older version of the ESP-IDF framework. To support switching to the new framework version, a PR was accepted today for ESPHome. Therefore, the next release of ESPHome will make life a bit better. It should make the firmware compatible with both the older and the newer version of ESP-IDF.

At the same time, I have submitted a feature request for the ESP-IDF framework, which would allow us to fix the underlying issue for once and for all: disabling the CRC check for the burnt-in MAC address, because the bslamp2 devices contain an invalid CRC. The boot loops originate from the ESP-IDF aborting the boot process, because it detects the invalid CRC. The logic being: maybe the correct data can be read after a reboot.
Let's hope that this request will be picked up and implemented soon.
If people want to leave likes for this feature request, here's the link to it: espressif/esp-idf#10401

@Progaros

thank you so much @mmakaay the restore procedure worked after >10 tries

now everything is back to normal

@mmakaay
Owner

mmakaay commented Dec 23, 2022

Wow, that took quite some tries. I salute your persistence, @Progaros 😄

@szafran81

szafran81 commented Jan 6, 2023

EDIT: I've finally managed to flash a working firmware on the first lamp. Thank you for the fix.

@Jearde

Jearde commented Jan 29, 2023

Today I got it to work with the ESPHome Add-On inside RPi Home Assistant, doing the following steps:
Base Commit: 116e425

  1. Comment out some versioning in /packages/core.yaml
esp32:
  board: esp32doit-devkit-v1
  framework:
    type: esp-idf
    sdkconfig_options:
      CONFIG_FREERTOS_UNICORE: y
    advanced:
      ignore_efuse_mac_crc: true
    # Bugfix for ESPHome 2022.12.0 and up: fallback to older platform
    # version, to prevent bricked devices. ESPHome uses newer versions
    # by default.
    # See also: https://github.com/mmakaay/esphome-xiaomi_bslamp2/issues/104
#     version: 4.3.2
#     source: ~3.40302.0
#     platform_version: platformio/espressif32 @ 3.5.0
  2. Install ESPHome (dev) version in Home Assistant.
  3. Install using Manual Download -> Legacy Format
  4. Flash with esphome-flasher

@mmakaay
Owner

mmakaay commented Jan 31, 2023

What version of ESPHome does that use then? These changes were specifically made for making things work with the latest ESPHome versions. Commenting them out ought to break things when on 2022.12.0+.

@Jearde

Jearde commented Jan 31, 2023

I had to comment out the versions, because the specified toolchain is not supported for 'linux_aarch64'. This is, however, the case with Home Assistant running on a Raspberry Pi 4 with the provided image from HA. Maybe it works in Docker versions of HA.

First, I got the following error when compiling the ESPHome firmware.
Error: Could not find the package with 'espressif/toolchain-xtensa-esp32 @ 8.4.0+2021r2-patch2' requirements for your system 'linux_aarch64' (Not sure about the exact version of xtensa. I didn't save the log. This information is based on my past Google searches.)

After making the changes to the /packages/core.yaml as mentioned above, I got the boot loop as described in 1356893265.

I couldn't use your fix with the manual older versions, because they are not available for my RPi system.
After you commented 1363428442 that your merge request was accepted, I changed my ESPHome Add-On to the dev branch inside Home Assistant.
I flashed the lamp again and no boot loop or any other problems were present.

What version of ESPHome does that use then? These changes were specifically made for making things work with the latest ESPHome versions. Commenting them out ought to break things when on 2022.12.0+.

EDIT:
I used the following version of ESPHome: 034b47c23a08f9980bdae07bcafbcf22fc43dc2e

It is used by the HA add-on: esphome/home-assistant-addon

@mmakaay
Owner

mmakaay commented Feb 4, 2023

Beware that although my request for a change in the ESP-IDF framework was accepted and the related PR was merged, the change is not yet in the ESP-IDF releases. The next 4.4.* and 5.* releases of ESP-IDF will likely contain the change.

Unless the change was backported into an already released version of ESP-IDF, I don't think that you would be able to benefit from it. Especially since the change would require an additional bit of configuration in the device YAML, to make use of it.

Bottom line: I can't explain why it worked for you, but I'm glad it did ;-)

@labodj

labodj commented Feb 15, 2023

@mmakaay I just want to report that your pull request esphome/esphome#4204 has been merged in the ESPHome 2023.2.0 release: https://github.com/esphome/esphome/releases/tag/2023.2.0

@mmakaay
Owner

mmakaay commented Feb 17, 2023

Yeah, based on that, I can now cook up a new release in which life can be made a little bit better.
I also planned to do some documentation updates, now api: password: has been deprecated in favor of api: encryption: key:.
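
For reference, a minimal sketch of that api: change, reusing the secret names from the device config posted earlier in this thread (the names are just those examples, nothing official):

```yaml
# Deprecated style, kept here only for comparison:
# api:
#   password: !secret bslamp_home_assistant_api_password

# Preferred style: encrypted native API connection.
api:
  encryption:
    key: !secret bslamp_home_assistant_encryption_key
```

The config posted earlier sets both password and encryption; once the key is in place, the password line can presumably be dropped.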

Next big step for a final fix for this stuff will be when my requested change in the ESP-IDF framework (supporting the broken MAC address CRC behaviour of these lamps directly from the ESP-IDF framework) is included in the released framework library version. As I understand, that will be both the next v4.4 and v5 versions of ESP-IDF 🎉 From then on, I don't need the hacky CONFIG_ESP(32)_PHY_CALIBRATION_AND_DATA_STORAGE option anymore.

@randybb

randybb commented Feb 17, 2023

I am running latest dev without the part about special versions (as I have mentioned in "our" discord thread), so probably it is already there.

CarlesLlobet added a commit to CarlesLlobet/esphome-xiaomi_bslamp2 that referenced this issue Apr 22, 2023
Trying to flash without old dependency.

Testing if prone to issue mmakaay#104
@hellcry37
Author

this can be closed, as fixed

10 participants